Reading in a Second Language: Process, Product and Practice
GENERAL EDITOR
CHRISTOPHER N. CANDLIN,
Chair Professor of Applied Linguistics
City University of Hong Kong, Hong Kong
For a complete list of books in this series see pages v-vi.
Error Analysis: Perspectives on Second Language Acquisition
JACK C. RICHARDS (ED.)
Stylistics and the Teaching of Literature
H.G. WIDDOWSON
Contrastive Analysis
CARL JAMES
Learning to Write: First Language/Second Language
AVIVA FREEDMAN, IAN PRINGLE and JANICE YALDEN (EDS)
Language and Communication
JACK C. RICHARDS and RICHARD W. SCHMIDT (EDS)
Reading in a Foreign Language
J. CHARLES ALDERSON and A.H. URQUHART (EDS)
The Classroom and the Language Learner: Ethnography and second-language classroom research
LEO VAN LIER
Observation in the Language Classroom
DICK ALLWRIGHT
Listening in Language Learning
MICHAEL ROST
Listening to Spoken English, Second Edition
GILLIAN BROWN
An Introduction to Second Language Acquisition Research
DIANE LARSEN-FREEMAN and MICHAEL H. LONG
Contents
Editors’ acknowledgements
Publisher’s acknowledgements
General Editor’s preface
Introduction
Literacy
The scope of the term ‘literacy’
Literacy and power
Cognitive v Social
Organisation of the book
1 Preliminaries
What is reading?
The primacy of spoken language
Writing systems
Reading versus listening
The L2 reader
Summary
Editors’ acknowledgements
The authors would like to thank Chris Candlin for his support
and guidance throughout the writing of this book. His insightful
comments and firm grasp of the whole domain of applied linguistics
have been invaluable. They would also like to thank Jin
Yan and Luo Peng from Shanghai Jiaotong University, PRC, and
other members of the reading research group at CALS, University
of Reading, for their constructive feedback and help in clarifying
our thoughts on the new blueprint for reading developed in this
book, and in particular for their help in developing our taxonomy
of reading skills and strategies.
Cyril Weir offers his gratitude to all his colleagues in CALS for
their understanding over the past few years and, in particular, to
Don Porter and Eddie Williams for their support and encouragement.
He would also like to thank Lois Archer, Shigeko Amano
and Jessica Wu for their close reading of the manuscript. Ron
White deserves a special acknowledgement for being one of the
rare breed of managers who is both liked and respected by his
colleagues. His view of the role of management as bringing the
best out of his staff rather than trying to get the most out of them
is much appreciated. CALS generously awarded this author sabbatical
leave both to start and finish this book.
Sandy Urquhart would like to thank his family for their support.
Publisher’s acknowledgements
Introduction
Literacy
The teacher of reading is in the business of attempting to improve
literacy. Literacy has been the focus of a great deal of work over
the last few decades. By discussing, however briefly, some of this
work, we shall hopefully set this book against a broad and meaningful
context.
The definition of literacy is crucial (cf. Baynham, 1995 and
Venezky et al., 1990). One implied definition, touching in its
optimism, can be found in the US Census of 1940, where a person
is judged literate if they are 10 years of age or older and have
completed 5 or more years in school (Newman and Beverstock,
1990). For serious discussion nowadays, the starting point must be
the UNESCO proposals of the 1950s that literacy should be defined in
terms of minimal and functional literacy (see Venezky et al., 1990).
The former refers to the ability to read and write a simple message;
the latter to a level of literacy sufficiently high for a person
to function in a social setting. Prior to this, attempts were made to
define what Street (1995: 76) refers to as ‘autonomous literacy’,
that is, literacy divorced from any context. Such a hypothetical
level is impossible to establish and is inappropriate in many situations.
As Venezky et al. (1990: ix) say:
the diverse communities that make up contemporary America are
so variegated that simple dichotomies such as literate-illiterate fail
to capture what are real differences in what people know and how
they behave in certain situations.
The decision to take into account the social relevance of literacy
has been momentous, leading as it has to modern notions of multiple
‘literacies’. The concept can apply both to attitudes towards
the value of literacy, and to the role of literacy in the society as a
whole. Street (1995) comments on problems caused by a clash
between the values of outside educators and local values in Melanesia
and among Amish communities in the USA.
In many parts of modern Western society, illiteracy in an adult
is seen as a stigma, which the person concerned is often at pains to
disguise. In a novel by Ruth Rendell, a character becomes alienated
from society as a result of such subterfuges, grows up devoid
of normal social emotions, and eventually murders her employers
in order to disguise the fact that she cannot take a message.1 In
contrast, in a well-known article, Fingeret (1983) describes how,
in urban communities in the USA, illiteracy is not a stigma, help
with filling in forms, etc., being exchanged for other skills like car
repairing.2 It is likely that large and important cultures exist which
view literacy in a way very different from our own. Certainly it has
been our experience in the L2 classroom that some students do
not seem to view written material as a potential source of information
accessible to the individual reader. If this observation is valid,
the students’ behaviour may be a result of views of the role of
literacy in their society. To remedy this, an intensified training in
reading skills may not be sufficient; possibly more generalised
approaches to the uses of written material might work.
The different levels of literacy required by different groups can
be seen when one considers different occupations or professions
in the same society. It seems obvious that some occupations
make greater professional demands of the written word than others.
Tudge - arguing in New Scientist that the population divides into
those who like reading and those who do not, and that the education
system should take this into account - notes that trainee surgeons
do not need to make much use of the printed word (they
can rely on video-tapes, computer simulations, etc.) whereas lawyers
do (Tudge, 1987). It is also obvious that the required level of professional
literacy varies radically over time. It seems likely that, in
the last century, plumbers, for example, had comparatively low
professional requirements for literacy. This is not likely to be the
case now.
L2 students in the language class are often preparing to study
on academic or other courses conducted in the L2. Such courses,
e.g. engineering, nursing, can be assumed to have their own literacy
requirements and criterion levels. It is clearly of vital importance
that test and syllabus designers take these different requirements
and levels into account. In Chapter 3 we discuss this topic under
task- and text-based factors. Particularly in LSP, Needs Analysis has
long been part of the design of courses (cf. Munby, 1978; Nunan,
1993). It is our opinion, however, that detailed ethnographic case
studies of the needs of different study areas have not been conducted,
or at least published. The difficulty of conducting such
studies should not be underestimated.
Related to this is the notion of transfer of literacy from one area
to another. Implicit in what has been said above is the assumption
that reading skills acquired in the reading class should be
transferred to the student’s eventual study area. However, the
literature on transfer tends to be pessimistic. Mikulecky (1990)
claims that a major misconception in literacy studies is that
‘mastering literacy in one context substantially transfers to other
contexts’, and adds ‘Transfer of literacy
abilities is severely limited by differences in format, social support
networks, and required background information as one moves
from context to context’ (p. 25). He describes ‘literacy in schools’
as involving ‘independent reading for answering questions at the end
of the chapter, or, on some occasions, carefully studying material
to remember, synthesize, or evaluate it [what we later refer to as
“careful local” or “careful global” reading]’ and claims that these
activities differ from ‘those used to read a troubleshooting manual
on the job or gather information to fill in a form’ (p. 25). Sticht
(1980) has also claimed that there is little transfer between general
reading ability and reading in specific situations. He contrasts
‘reading to learn’ in school, with ‘reading to do’ in the workplace,
or ‘reading toward a goal of locating information for immediate
use that need not be recalled later’. While transfer is always a
potential problem, we see behind some of this discussion rather
conventional and unimaginative limitations on the types of reading
which should be practised in the reading class, and point
below (Chapter 4) to a wider range of tasks and behaviours.
Returning now to multiple literacies, the notion extends to
individuals. A reader may be highly literate in one area but minimally
literate in another. The present authors are not functionally
literate with respect to, for example, professional scientific texts,
and only marginally so vis-a-vis teenage fanzines. The L2 teacher
of adults will be familiar with the situation in which students in
the class are superior readers in certain professional or other
areas. In such cases, common in LSP situations, the teacher clearly
must see their role as that of facilitator rather than pedagogue.
This aspect of multiple literacy also throws doubt on the concept
of the ‘good reader’, often referred to in the research literature.
Venezky (1990: 12) argues that ‘Most readers show differing reading
abilities across different types of material’, a claim repeated by
Urquhart (1996). An implication of this is that, for the L2 reader,
we may sometimes need not a single test score but a profile composed
of performances on different types of text and task. In
Chapter 3 we present evidence from testing to support this view.
What might be viewed as an extreme case of different literacies
within the same individual can be seen when the individual moves
into a different language area. Venezky draws attention to the
immigrant into the USA, illiterate in English but literate in Vietnamese
and French; and the reverse, literate Americans who are
functionally illiterate when they move to other parts of the world.
A large proportion of L2 students in reading classes are already
literate in their L1 (but see Wallace, 1988; Williams, 1995). A
number of consequences arise from this differential literacy in different
languages. The fact that many learners are literate in their
L1 but not functionally literate in the L2 is the basis for the
notion of ‘transfer of reading skills’ discussed in Chapter 2. More
generally, L1 literacy must be one of the components which readers
bring to the task of L2 reading (cf. Bernhardt, 1991b). Bernhardt
sees ‘literacy’ in this sense as consisting of knowledge of texts, etc.
However, it must be broader than that, including not just, in
many cases, knowledge of script, but also, crucially, knowledge
that written text is language, containing messages from other
language users.
Cognitive v Social
According to Bernhardt (1991b: 6), ‘taking a cognitive perspective
means examining the reading process as an intrapersonal problem
solving task that takes place within the brain’s knowledge structures’.
She notes (p. 8) that the critical element in any cognitive
view of reading is that it is an individual act. On reading as a
social process, she cites Bloome and Greene (1984): ‘reading is used
to establish, structure, and maintain social relationships between
and among peoples’.
Bernhardt doesn’t specify the social aspects except to point out
that the same individual and the same text can vary:
the processing of text can be viewed only within a unique cultural
context . . . there are basically no generic or generalized readers or
reading behaviours . . . there are multiple readers within one person
. . . multiple ‘texts’ within a text. (pp. 10-11)
There are, however, many more factors than this in the social
approach to reading.
No text on reading can ignore the social aspects of the activity.
Reading, as the studies of literacy mentioned above have made
clear, is a social activity, related always to particular contexts. However,
there is no doubt that in comparison to a book like Wallace
(1992a) we put more emphasis on the cognitive side. There is
more than one reason for this.
1. We consider that the cognitive aspect is primary. Reading without
social aspects might be an odd idea, but something like it
exists in the experiments of cognitive psychologists, some of
whom are mentioned in Chapter 2. Reading without cognitive
activity, on the other hand, is simply an impossibility.
2. Wallace remarks, quite correctly, that ‘as readers we are frequently
addressed in our social roles rather than our personal
and individual ones’ (1992a: 18) and gives as an example the
reader of an advertisement aimed at a social group. We would
claim, however, that while this is clearly relevant to an analyst
of the text, and certainly relevant to the production of the text,
it is not necessarily relevant to readers who probably see themselves
as reading the text as individuals. To this extent, reading
is always an individual activity. We should not, of course,
fall into the trap of equating ‘individual’ with ‘cognitive’; the
individuals bring their own societies with them. Nevertheless,
there is an element of truth in the equation - see Bernhardt above.
3. Wallace claims that: ‘Classrooms are themselves communities
with their own uses of literacy and ascribed roles for teachers
and learners.’ We agree, but are conscious that the L2 reading
class is often a very loose, transitory community, more akin in
some ways to the population of an airport lounge. The members
have come from different communities and are intending
to join different communities. The communities that they plan
to join must remain in our consciousness. Again, we see classes
as collections of individuals. And if this is true of classrooms,
how much more is it true of our other prototypical situation,
the examination hall?
4. Finally, in Chapter 4, where we review teaching methods, we
try to focus on those areas where empirical evidence exists in
support of particular practices. It is our impression that more
of such evidence at present exists for those methodologies which
have a cognitive, rather than a social focus.
We do, however, try to take the social aspect of reading explicitly
into account in our handling of comprehension in Chapter 2,
and in the importance we ascribe to authentic tasks and texts in
Chapters 3 and 4. A fuller treatment of the social approach to
reading, including activities for the classroom, can be found in
Wallace (1992a, 1996).
Notes
1. Rendell, R. (1977) A Judgement in Stone.
2. It is perhaps worth noting that, in our own society, highly literate
adults will admit without embarrassment to not being able to ‘read’
music.
3. Wallace (1996) accepts this, suggesting that critical reading is one
strategy available to the reader.
1 Preliminaries
What is reading?
Figure 1.1 Number of articles and other publications published between 1966 and 1996 that mention ‘reading’ in
Figure 1.2 Number of articles and other publications published between 1966 and 1996 that mention ‘comprehension’
in their title or in ERIC’s index or abstract (based on data from ERIC).
Reading is the process of receiving and interpreting information encoded in
language form via the medium of print.
This may not be very neat but it suits our purposes.
There seems little doubt that the position of those who consider
reading a secondary process has in the past been reinforced by
the assertion, virtually an axiom in linguistics, that spoken language
is primary. Spoken language, the textbooks tell us, is primary
both phylogenetically and ontogenetically, i.e. speech preceded
writing in the history of the human species, and for the normal
individual child, speech also comes first. This is unquestionably
true, and no one nowadays would dispute it. But as Sampson
points out, linguists adopted this position in a programmatic way;
that is, speech was the aspect of language that linguists were supposed
to study. And one of the reasons for adopting the position
was as a reaction against a previously dominant view that written
language represented the correct form, as opposed to debased, or
‘careless’ speech forms. As part of this reaction, written language
was held to be derivative, and not worthy of study.
From a strictly theoretical point of view, this attitude is at least
unnecessary. De Saussure drew a distinction in language between
form and substance. By form he meant the whole structure of relationships,
both contrasts and equivalences, inside the language;
one of the meanings of substance is the medium in which the form
is realised. Just as, in de Saussure’s well-known analogy with chess,
the game can be played with pieces composed of any number of
materials or shapes, so language can be ‘realised’ by different
media, in our case by either sounds (spoken language), or shapes
(written language). The implication of this argument is that written
language can be seen as a realisation of a language parallel to
the spoken form. For de Saussure, language was basically form;
the substance in which the form was realised was immaterial. Based
on this distinction, Lyons (1968: 60-1) argues that:
When we say that [t] is in correspondence with t, [e] with e, and in
general that a particular sound is in correspondence with a particular
letter, and vice versa, we can interpret this to mean that neither
the sounds nor the letters are primary, but that they are both alternative
realizations of the same formal units, which of themselves are
quite abstract elements, independent of the substance in which
they are realized.
Making the same point, Sampson (1980) remarks:
After all, English is still English whether we realise it as spoken
sounds or as ink on paper.
Whether linguists did, in fact, obey the injunction to restrict
themselves to spoken language is doubtful. Brown and Yule (1983)
remark that the language in descriptive linguistic grammars is
often characteristic of the written form. And Lyons (1968) remarks
that in a highly literate society, linguists find it difficult to view
spoken language objectively.
Apart from this, however, the long-standing linguistic prejudice
against written language has recently been eroded. Sampson is
one linguist who considers that written language is a proper area
of study for linguistics. For him the written and spoken languages
are two ‘closely related dialects’, and the influence of one dialect
on the other is not one-way. In an article provocatively titled ‘The
primacy of writing’, Householder (1971) claims that the written
language ‘has probably been the greatest single cause of phonological
change in modern English, both British and American’. It
is likely that a considerable amount of the vocabulary, syntax and
knowledge of rhetorical structure of a mature native speaker is
learned via the written language.
If this was simply a dispute within linguistics, then it would not
matter to us here. The study of reading may borrow from linguistics,
but does not belong to the area. The linguistic argument
as to relative primacy is in one sense irrelevant. From the point of
view of a language user at a particular point in time, it is ridiculous
to argue that speech is always primary. If we meet a neighbour
and exchange greetings, we choose speech; anything else
would be strange. On the other hand, when we set out to complete
an income tax return, we are more or less obliged to resort
to the written language - a failure to complete the form on the
basis that written language was secondary would not be treated
with much sympathy.
For the purposes of this book, we assume that written and
spoken language are parallel realisations of language. This is not
to deny that, at least for the native speaker, learning to read will
normally involve a transfer from the spoken language to the written
language. But we are not bound to believe that the written language,
and hence reading, remains parasitical on the spoken form.
We have not, however, answered the question as to whether the
reader, when performing, is dependent on spoken language. To
some extent, linguists are not well qualified to decide on this
question. Linguistics describes language systems. If the systems
are different, then linguists are well qualified to describe the difference.
However, reading is a process, involving mental activity.
Linguists are not professionally equipped to make judgements
about such processes, although they are as entitled as anyone else
to venture opinions. Thus the view, derived from linguistic theory,
that a written language could exist parallel to a spoken language
as a separate realisation of the same underlying language system
simply allows us to say that such a situation could happen, that is,
that users could use written language independently of the oral
language. It does not entitle us to claim that users do actually do
this, or even that it is practically possible. Thus we are left with the
possibility that readers could be ‘translating’ into the spoken language
as they read, as those who push the claims of ‘vocalisation’
have asserted.
The debate has taken place in terms of a suggested opposition
between a direct and a phonological route to word recognition
(see Chapter 2). The reader using a direct route is envisaged as
going straight from written word to meaning, without an intervening
‘spoken’ stage. When using a phonological route, the reader
is seen as going from the written word to its spoken pronunciation,
using phoneme/grapheme correspondences, and finally to
meaning. In the pedagogical literature, the distinction can be
seen in the difference between the look and say method of initial
reading and phonic approaches.
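The contrast between the two routes can be pictured with a toy lookup. This is only an illustrative sketch: the lexicon entries, the one-letter-one-sound table and the function names are all invented for the example, and nothing here is a claim about how readers actually process words.

```python
# Toy data, invented purely for illustration.
ORTHOGRAPHIC_LEXICON = {"cat": "small domestic feline", "mat": "floor covering"}
GRAPHEME_TO_PHONEME = {"c": "k", "a": "æ", "t": "t", "m": "m"}
PHONOLOGICAL_LEXICON = {"kæt": "small domestic feline", "mæt": "floor covering"}


def direct_route(written_word):
    """Go straight from the written form to a meaning, with no spoken stage."""
    return ORTHOGRAPHIC_LEXICON.get(written_word)


def phonological_route(written_word):
    """Convert graphs to sounds first, then look up the resulting sound form."""
    try:
        spoken = "".join(GRAPHEME_TO_PHONEME[g] for g in written_word)
    except KeyError:
        # A graph with no known sound value blocks this route.
        return None
    return PHONOLOGICAL_LEXICON.get(spoken)


if __name__ == "__main__":
    for word in ("cat", "mat"):
        print(word, "|", direct_route(word), "|", phonological_route(word))
```

In this sketch both routes arrive at the same meaning for a known word; they differ only in whether a spoken form is computed on the way.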
It might at first seem easy to decide whether readers use direct
or phonological routes. One may argue that if the readers can be
shown to vocalise while reading, then they are using the phonological
route. However, things, as often, are not so simple. Rayner
and Pollatsek (1989) distinguish between ‘subvocalization’ (muscular
activity in the speech tract) and ‘inner speech’ (the ‘voice’
we hear in our heads while reading). They conclude from the
experimental data that subvocalisation is ‘a normal part of natural
silent reading’ (p. 192). However, they point out that its function
is obscure. They are more interested in the phenomenon of inner
speech, but this is a more mysterious entity, and its function is
controversial. One might argue that since some L2 readers may
not know how to pronounce certain words, they are precluded
from phonological access. However, a little thought should show
that this is an unjustified conclusion as they may be using a non-native
pronunciation.4 We return to this topic in rather more
detail in Chapter 2.
Some light may be thrown on the problem by an examination
of different writing systems, both alphabetic and other, which we
touch on next. The discussion should be of value in two other
respects: statements about reading by European authors often imply
the use of an alphabetic system of writing. We need to be warned
against such Eurocentrism. Also, the discussion should remind us
of a possibly important factor to be taken into account when we
discuss the diversity of L2 readers.
Writing systems
The account given here is based on Sampson (1985), and we use
his terminology. In particular, we use the term graph to refer to
any sign in a writing system, e.g. for alphabetic letters and for
Chinese ‘characters’.
Sampson divides scripts into two major types. In logographic
scripts, of which his main example is the Chinese writing system, a
graph relates to a meaning unit, either a word or morpheme (in
Chinese, according to Sampson, the distinction between word and
morpheme is not clear, and symbols largely refer to morphemes).
It is as if, in English, a single symbol, say C, represented the word/morpheme
cat, and another symbol, say M, represented the word/morpheme
mat.5
Phonographic scripts, on the other hand, rely on analysing language
at the sound level. Included under phonographic scripts
are syllabic, consonantal, alphabetic and featural scripts. In syllabic
scripts, a graph represents a syllable. In English, mat would be
spelled with one symbol, but matted would require two. Teachers
who instruct children that the sequence cat ‘spells /kə/, /æ/, /tə/’
could be said to be converting English spelling into a syllabic
script. In a system of astonishing complexity, Japanese uses two
syllabaries, hiragana and katakana, to supplement a logographic
script, kanji, derived from Chinese writing.
In consonantal scripts, of which Arabic and Hebrew are two
important modern examples, the syllabic sound unit is segmented,
but only consonants are systematically represented. It is as if in
English, mftkl spelled emphatically. The system suits Semitic languages
such as Arabic, where vowels are largely predictable from
the grammar, but would give clear problems in English.
In alphabetic systems, such as the ones familiar to readers of
English and other European languages, both vowels and consonants
are represented. Finally, in what Sampson terms featural
systems, the symbols represent phonological features. In English,
for example, the symbols k and g represent velar plosives distinguished
phonologically by the fact that the first is voiceless and
the second voiced. We could move English spelling in the direction
of a featural system if, say, we spelled the sounds /k/ and /g/
with one graph, but represented the presence of voicing by underlining.
Thus /k/ would be represented by the graph k, and
/g/ by the graph k̲. Similarly t could represent the phoneme /t/
and t̲ represent /d/. Sampson gives two examples of featural systems,
Pitman’s shorthand, and Han’gul, devised, in a remarkable
feat of linguistic analysis, by a fifteenth-century Korean, and used
as a national script by both present Korean republics.
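The underlining proposal can be made concrete in a few lines. This is a minimal sketch under stated assumptions: the four-phoneme inventory, the use of the Unicode combining low line as the underline, and the function names are all choices made for illustration, not part of Sampson's account.

```python
# Hypothetical featural respelling: one base graph per place of articulation,
# with voicing marked by an underline (Unicode combining low line, U+0332).
UNDERLINE = "\u0332"

# Assumed toy inventory: phoneme -> (base graph, is the phoneme voiced?)
FEATURES = {
    "k": ("k", False),
    "g": ("k", True),
    "t": ("t", False),
    "d": ("t", True),
}


def featural_graph(phoneme):
    """Return the base graph, underlined when the phoneme is voiced."""
    base, voiced = FEATURES[phoneme]
    return base + UNDERLINE if voiced else base


def respell(phonemes):
    """Respell a sequence of phonemes in the toy featural script."""
    return "".join(featural_graph(p) for p in phonemes)


if __name__ == "__main__":
    for p in ("k", "g", "t", "d"):
        print(f"/{p}/ is written {featural_graph(p)}")
```

The point of the sketch is that the script needs only two base graphs plus one voicing mark to cover four phonemes, which is what makes the system featural rather than alphabetic.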
For the purposes of the argument here, the most significant
point is that logographic scripts do not rely on an analysis of the
sounds of words. Thus, according to Sampson, Chinese script was
composed originally of graphs which each represented a word/morpheme.
At this stage, there was no generalised relationship
between graph and sound. The system was extended according to
the rebus principle, whereby a graph for a word could also be used
for another word with the same or similar pronunciation. In English,
using this system, a graph representing ‘eye’ could also be
used for the pronoun ‘I’. This does introduce a phonetic element
into the system, though not in any systematic way. The rebus
principle, however, has limitations. Chinese contains, by the standards
of European languages, an astonishing number of homophones
(words with the same pronunciation).6 It is as if English
had, say, twelve words all pronounced the same as ‘eye’. In order
to reduce ambiguity, the system was again extended to include
complex graphs. These are composed of at least two elements,
one of which Sampson calls a phonetic, and the other a signific
(other writers use the terms classifier, or radical, for signific). The
phonetic hints at pronunciation, the signific at semantic value. If
we try to construct an English analogy, we might think of a graph,
say M, representing the word meet. Then, by the rebus principle,
this can be extended to represent both meat and the archaic adjective
mete, which are all homophones. Our graph now represents any
of three homophones. We can reduce ambiguity by producing a
complex graph. By incorporating a signific f, meaning, say, food,
we can produce a complex graph Mf which is interpreted as meaning
‘sounds like meet but has to do with food’. Similarly, having
already a graph representing the word flower, say F, we can write Ff
to indicate ‘sounds like flower but has to do with food’, i.e. flour.
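The English analogy can be traced step by step in a short sketch. The graphs M, F and the signific f come from the analogy itself; the dictionaries, the semantic-field labels and the function name are assumptions introduced only to make the disambiguation explicit.

```python
# Simple graphs after the rebus extension: one graph, several homophones.
REBUS = {
    "M": ["meet", "meat", "mete"],
    "F": ["flower", "flour"],
}

# Significs hint at a semantic field (labels assumed for this sketch).
SIGNIFIC_FIELD = {"f": "food"}

# Assumed semantic fields for the candidate words.
WORD_FIELD = {"meat": "food", "flour": "food"}


def read_graph(graph):
    """Return the candidate words for a simple or complex graph.

    The first character is treated as the phonetic and any remaining
    character as the signific, e.g. 'Mf' = phonetic M + signific f.
    """
    phonetic, signific = graph[0], graph[1:]
    candidates = REBUS[phonetic]
    if not signific:
        return candidates  # ambiguous: every homophone is still possible
    field = SIGNIFIC_FIELD[signific]
    return [word for word in candidates if WORD_FIELD.get(word) == field]


if __name__ == "__main__":
    for graph in ("M", "Mf", "F", "Ff"):
        print(graph, read_graph(graph))
```

Here the bare graph M leaves three readings open, while Mf and Ff each narrow to a single word, which is the ambiguity-reducing job the text assigns to the signific.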
So far, it might look as if Chinese graphs were essentially phonographic,
but Sampson denies this on two grounds. Firstly, the
pronunciation of Chinese has changed radically since the ‘spelling’
was stabilised; secondly, the sound correspondence between
the original phonetic and the secondary complex graph was not
always close. To take another example, it is as if in English we had
M (= meet) plus a signific c indicating clothing, giving us the graph
Mc for mitt. The result is, as Sampson (1985: 157) says,
A Chinese-speaker who learns to read and write essentially has
to learn the graphs case by case; both significs and phonetics will
give him many hints and clues to help him remember, but the
information they supply is far too patchy and unreliable to enable
him to predict what the graph for a given spoken word will be,
or even which spoken word will correspond to a graph that he
encounters for the first time.
It has to be said that Sampson’s account has been vehemently
criticised by DeFrancis (1984) and by Unger and DeFrancis (1995).
They claim that 66 per cent of the graphs used in Mandarin
contain enough phonetic indicators to enable a reader to make a
good guess at the syllable (this figure being based on an analysis
carried out in 1942). They thus claim that Chinese script is basically
a rather odd syllabary. They back this up with the claim that
learning thousands of logograms would be equivalent to learning
thousands of telephone numbers.
To some extent the argument is academic; Sampson does not
argue that the system is purely logographic, merely that it is not
primarily phonographic. We are, however, rather dubious about
some of DeFrancis and Unger’s arguments. Any empirical experiment
devised to test the phonological readability of graphs would
be bafflingly difficult to set up. The subjects would have to be
literate readers of Chinese, for which a knowledge of 5000 graphs
is often suggested as minimal. On the other hand, the graphs they
were shown would have to be unfamiliar to them, otherwise the
subjects’ evidence would be hopelessly unreliable. Therefore we
are justified in doubting the statistics cited by DeFrancis and
Unger.7 As opposed to telephone numbers, Chinese graphs contain
a certain amount of iconic information (i.e. some of them
are recognisable in picture terms), and semantic information (the
significs). Moreover, even if the phonetic information were as
important in Chinese reading as Unger and DeFrancis say, it is
difficult to see how it would remain important when the same
graphs are used to write Japanese, a totally different language.
Sampson’s main response to earlier criticism by DeFrancis
(Sampson, 1994) is that DeFrancis fails to distinguish between the
historical origin of a linguistic system and the nature of the system
as it exists at the present.
However, we are not in a position to make a real judgement on
the issue. It is one of these areas in which empirical evidence is
crucial yet seems difficult to collect. We think that there is enough
force in Sampson’s case to suggest that Chinese and Japanese
readers - a large section of the world’s literate population - are
likely on many occasions to make use of direct access to lexical
items. This at least should make us cautious about making Eurocentric
generalisations on the nature of reading, based on our
experience with an alphabet. As Sampson says (1985: 146):
The European idea that from a knowledge of the pronunciation of
a word one should be able to make at least a good guess at how to
write it would seem bizarre to a Chinese.
In fact, what would appear to be the situation in Chinese reading
is often very strange, if not bizarre, to a European reader. According
to Sampson, because of the differences between the written
and spoken languages, a Chinese script can be read aloud but
may very well be incomprehensible as a spoken text.
But we can go further than this. Sampson is not the first to
argue that, for adult readers of English at least, words are treated
as logograms. It has been argued that English orthography is so
irregular in its grapheme/phoneme correspondence that it virtually
forces readers to use a direct lexical access route. But we
should not assume, even if the correspondence is very regular, as
in Spanish, that experienced readers will rely on decoding phonologically.
It may well be that while this is a useful device for the
learner, it becomes increasingly discarded the more advanced the
reader becomes.
In this book we shall assume the possibility that both direct and
phonological routes may be open to a reader of English, as well as
combinations of the two.
The L2 reader
Summary
In this chapter we have tried to deal with a number of possibly
contentious issues. We have argued that reading is a language
skill, an aspect of language performance. It follows that level of
performance is one aspect of a person’s ability to use the language.
We have presented the distinction between reading as decoding
and reading as message interpretation, and come down conventionally
enough on the side of the second view. Somewhat more
controversially, we have suggested that reading ability must go
beyond ‘pure’ language skills, and include pragmatic knowledge
and skills, whereby the readers interpret the text in terms of their
knowledge of the world. Finally, we have pointed out the similarities
between reading and listening, both of which are receptive language
skills, and taken the position that reading is not necessarily
dependent on listening, but may be a parallel mode of language
reception, with implications for models of the reading process:
activity.
Notes
1. Reading Research Quarterly, XV, 2.
2. We should like to thank Ted Brandhorst of ERIC for his help in
compiling this data.
3. Some years ago The Guardian published an article on remedial reading.
Discussing her experiences teaching a 10-year-old native speaker,
one ‘reading expert’ remarked: ‘So I taught him the word “me”.’ One
would have imagined that most native speakers know the word ‘me’ by
that age; what they don’t know is its visual representation.
4. The eastern academic who referred in a lecture to ‘ospices deeties’
knew perfectly well what he meant. His British audience took some
time to work out that the phrase represented ‘auspicious deities’.
5. It is arguable that in a language such as English, numerals are logograms
rather than semasiograms. That is, the word ‘three’ is in a one-to-one
relationship to the symbol ‘3’, which thus represents a linguistic
word rather than a non-linguistic concept. The fact that ‘3’ in French
represents a different word, ‘trois’, is of no relevance to English. However,
symbols such as ‘+’ are clearly semasiographic: the string ‘3+3’
could be translated into English as ‘three plus three’, ‘three and three’,
‘adding three and three’ and so on.
6. One of the authors was told that a Chinese student called ‘Yang’ was
due to join a forthcoming course, and that ‘yang’ in Mandarin meant
‘sheep’. When he asked another Chinese student, a university lecturer,
whether this translation was accurate, he was surprised to get the answer,
‘I don’t know’. It emerged that ‘yang’ has about 10 or 12 meanings, so
that without the appropriate Chinese graph, the particular meaning
intended cannot be decided on.
7. Sampson (1994) doubts the statistics for another reason, namely that
the appropriateness of the phonetic element to a graph’s pronunciation
cannot be measured statistically.
2 The theory of reading
2.1 MODELS
Reading can clearly be viewed as a cognitive activity; it largely
takes place in the mind, and the physical manifestations of the
activity, eye movements, subvocalisation, etc., are comparatively
superficial. As a cognitive activity, reading has, since the 1960s,
been a major interest of cognitive psychologists. In fact, the huge
increase in articles cited in ERIC, which took place in the 1960s, is
probably brought about to a major extent by psychology research.
In this section we begin by looking at some of the psychological
work which we consider to be relevant to our main themes.
Cognitive psychologists who are interested in reading construct
and test hypothetical models of the reading process as it is thought
to take place in the human mind. Some of these models are in
outline familiar to many teachers of reading: the bottom-up and
top-down models have achieved some fame in the teaching world
since the 1970s, as has the later development, the interactive-
compensatory model. These will be examined in some detail below.
The first two of these models have inspired recognizable
methodological approaches; the third has, perhaps, yet to make
its mark.
In fact, the contribution of the cognitive psychologists to L2
reading has been a major one, and would in itself be justification
for a review in this chapter. In addition, these psychologists provide
an admirable example of how to formulate and test empirical
hypotheses, activities which have not always been the strong point
of L2 reading. In this chapter, there is not enough space to give a
full account of the experimental methods involved - a number of
which will be examined in Chapter 5. We hope, however, that
we have included enough in this part to indicate how hypotheses
are tested.
When one sets out to examine a factor possibly involved in
reading performance - for example, grammatical knowledge, or
knowledge of the world - there are basically two questions to ask.
The first is whether it can be shown that the factor does have a
measurable effect on reading performance. Once that has been
established, one can then ask in what precise way the factor operates.
It is our impression that researchers in L2 reading have
largely been content with examining the first question, and that
for determined attempts to answer the second question we must
turn to the psychologists.
That having been said, the psychologists cannot be expected to
answer all our problems. They tend to be at their most convincing
when reporting work on what are often labelled ‘low level’ activities,
e.g. word recognition, and not to have a great deal of empirical
data on ‘higher level’ activities, such as comprehension of
extended text. Then again they have their own professional interests
which are not likely to be identical to those of teachers and
testers of reading.
An example of the sort of explanation psychologists look for
may make this clear. Stanovich (1980), well known for his formulation
of the ‘interactive-compensatory model’, reports that adults
and children faced with the task of naming a target word, were
faster when the word was preceded by an incomplete sentence
congruent with it (so that, presumably, the completed sentence
made sense). However, the children’s performance was badly
affected when the target word was preceded by a non-congruent
sentence. Adults, on the other hand, showed no difference in the
time they took in this second condition from the time when
the word was presented out of context. Stanovich suggests that
these results may be explained by a theory which postulates that
there are two processes acting, an automatic activation process
and a conscious attention mechanism. Faced with such a discussion,
the L2 researcher may be tempted to think that, whatever the
validity of the theory, it has little to offer the teacher or tester of
L2 reading. Such a reaction may be unwise, but the example
should remind us that, however grateful we are to the psychologists,
we must constantly assess their work for its relevance.
Classes of reading model
Process models
Process models may be sequential, that is, they model the reading
process as a series of stages, each of which is complete before the
next stage begins. Alternatively, they are non-sequential, as in the
case of Stanovich’s interactive-compensatory model, where ‘a pattern
is synthesised based on information provided simultaneously
from several sources’ (Stanovich, 1980: 35).
The popular view of the development of process models, which
turns up in many article introductions and innumerable PhDs,
goes roughly as follows. First of all came the bottom-up approach,
which was then replaced by the top-down model, which in turn
was replaced by interactive models. In fact, the most frequently
cited example of a bottom-up model, that of Gough, was published
in 1972, whereas the corresponding most frequently cited example
of a so-called top-down theory, that of Goodman, was first published
in 1967. On the evidence of this paper, Goodman was reacting,
not against a psychological model, but against a pedagogical
approach to the teaching of initial reading. More seriously, although
Goodman is usually cited as a top-down theorist, there is a good
argument that his theory is an interactive one. Most serious of all,
while the name ‘top-down’ suggests the reverse of a bottom-up
model, no such top-down model exists, nor does it seem likely
that it could ever exist. Finally, while we are all interactive theorists
now, the general impression given by some is that a bottom-up
approach, with some modifications, has won the day, e.g. ‘while
all reading may not be characterised as a data driven, bottom-up
process, fluent reading may best be characterised as just such a
process’ (Hoover and Tunmer, 1993: 4). It is worth keeping in
mind that, while this may well be true of word recognition, and
less certainly of syntactic processing, it is certainly not proved to
be the case for other aspects of the reading process.
Bottom-up approaches
Bottom-up analyses begin with the stimulus, i.e. the text, or bits of
the text. In Gough’s (1972) model, the reader begins with letters,
which are recognized by a SCANNER. The information thus gained
is passed to a DECODER, which converts the string of letters into a
string of systematic phonemes. This string is then passed to a
LIBRARIAN, where with the help of the LEXICON, it is recognized as
a word. The reader then fixates on the next word, and proceeds
in the same way until all the words in a sentence have been processed,
at which point they proceed to a component called MERLIN,
in which syntactic and semantic rules operate to assign a meaning
to the sentence. We should point out that this is only part of the
model. The final stage is that of the Vocal System, where the reader
utters orally what has first been accessed through print. Gough’s
model of the reading process is a model of the reading aloud
process.
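
The strictly sequential character of such a model can be made concrete with a small sketch. The Python below is our own illustration and not Gough's formalism: the function names merely echo his component labels, and the letter-to-phoneme table and the lexicon are invented toy data. It is meant to show only that each word passes through every word-level stage in turn, and that sentence-level interpretation waits until the last word has been looked up.

```python
# Toy sketch of a strictly sequential, bottom-up pipeline.
# All tables and names are hypothetical and chosen for illustration only.

# Invented letter-to-phoneme table (one phoneme code per letter).
LETTER_TO_PHONEME = {"c": "K", "a": "AE", "t": "T", "s": "S", "i": "IH"}

# Invented lexicon, keyed by phoneme string.
LEXICON = {"K AE T": "cat", "K AE T S": "cats", "S IH T": "sit"}


def scanner(printed_word):
    """Identify the letters of a printed word, left to right."""
    return list(printed_word.lower())


def decoder(letters):
    """Convert the letter string into a phoneme string ('?' if unknown)."""
    return " ".join(LETTER_TO_PHONEME.get(letter, "?") for letter in letters)


def librarian(phonemes):
    """Look the phoneme string up in the lexicon; None if it is not there."""
    return LEXICON.get(phonemes)


def merlin(words):
    """Stand-in for sentence-level syntactic and semantic interpretation."""
    return {"words": words, "interpreted": None not in words}


def read_sentence(sentence):
    # Every word completes all word-level stages before the next is fixated;
    # the sentence-level stage starts only once all words are processed.
    recognised = [librarian(decoder(scanner(word))) for word in sentence.split()]
    return merlin(recognised)
```

Calling read_sentence("cats sit") in this sketch recognises both words and hands them on for interpretation, whereas a string containing letters outside the toy table fails at the lexicon stage. Nothing computed at a later stage can influence an earlier one, which is exactly the property that the experimental findings discussed below call into question.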
We should note that in a model like Gough’s, there are two sets
of entities. First there are text units, arranged more or less in
order of size - that is, the model envisages the reader dealing with
letters, words, then sentences, in that order.1 This in itself would
entitle it to the term ‘bottom-up’. In addition, though, there are
the processing components, in Gough’s case the scanner, decoder,
librarian, then ‘Merlin’, which are brought to bear on the text
units. More generally, such knowledge/skills components as letter
recognition, lexical access, syntactic parsing, semantic parsing, are
frequently ranked in the literature as ‘lower’ or ‘higher’ skills -
usually, it seems, because of the text components they relate to. In
Gough’s model, textual and processing components operate in
parallel, but this, as we shall see, is not absolutely necessary.
As Rayner and Pollatsek (1989) point out, Gough’s model is
explicit enough to be tested at various points, with the result that
the straightforward bottom-up direction has had to be emended.
For example, in the original model, the letters were seen as being
fed serially into the scanner for recognition. If this were true,
then a word should take longer to recognize than a single letter.
But in fact experiments have shown that this is not the case: words
can be recognized more quickly than individual letters, and even
pseudo-words can be recognized at the same speed as single letters.
It appears, then, that at the word-recognition stage, letters
are processed in parallel. More importantly for the debate about
the direction of processing, readers have been shown to use syntactic
information to deal with ambiguous words. And Kolers (1969)
found that bilingual readers, reading part French, part English
texts aloud, pronounced words as they would be in the ‘predominant
text language’, e.g. ‘murs’ might be read as ‘moors’. Thus it
appears that ‘higher level’ information is being used in word recognition,
which conflicts with the unidirectionality of the model.
It is also difficult to see how, as is claimed, one stage of the
process is over before the next stage begins. If all the words in a
sentence had to be recognised before syntactic processing began,
then the model would not appear to have any way of knowing
when to stop processing words and move to processing sentences.
With words, there seem to be few problems: not only is a word
indicated by white space on either side, but it will, if all goes well,
be present for recognition in the lexicon. With sentences, however,
there is no real equivalent of the lexicon, and it is hard to
believe that the reader is entirely dependent on the clues provided
by stops and capital letters.2 On the other hand, if words are
accessed one at a time and fed into the syntactic processor once
recognised, then recognition and syntactic processing are surely
going on at the same time.
Top-down approaches
If, with bottom-up models, it is difficult to see when to stop, with
top-down models, the difficulty is seeing where they should begin.
Bottom-up models start with the smallest text unit, either letters
or letter features. One might expect, then, that top-down models
should begin with the largest unit, the whole text. However, it is
virtually impossible to see how a reader can begin by dealing
with the text as a whole, then proceed to smaller units of the text,
say paragraphs, then down to individual sentences, ending with
single letters. In fact, the term ‘top-down’ is deceptive, appearing
to offer a neat converse to ‘bottom up’, a converse which in reality
does not exist.
In practice, the term is used to refer to approaches in which
the expectations of the reader play a crucial, even dominant, role
in the processing of the text. The reader is seen as bringing hypotheses
to bear on the text, and using the text data to confirm or
deny the hypotheses. The scope of a hypothesis varies considerably.
In the account by Goodman (1967), possibly the best-known
name associated (perhaps wrongly) with top-down approaches,
the hypotheses relate largely to single words. In the applied linguistic
and L2 literature, the hypothesis may relate to the whole
text, and be generated by reference to supposed schemata. Given
the somewhat misleading nature of the term ‘top-down’, we suggest
that the related terms ‘text(or data)-driven’ and ‘reader-driven’
are more generally useful when describing the contrast between
‘bottom-up’ and ‘top-down’. In the first, the reader processes the
text word for word, accepting the author as the authority. In the
second, the reader comes to the text with a previously formed
plan, and perhaps omits chunks of the text which seem to be
irrelevant to the reader’s purpose.
Goodman is often cited as the representative of the top-down
approach, though he himself has denied the association, and it is
arguable that Frank Smith (1971, 1973) is the more appropriate
choice. As is well known, Goodman views reading as a process of
hypothesis verification, whereby the readers use selected data from
the text to confirm their guesses. Judging by the 1967 paper, it
appears that he developed his position as a reaction, not against
theorists like Gough, but against a pedagogic tradition, which
stressed a fairly strict bottom-up approach to the teaching of reading
to young native speakers. Goodman characterizes this approach
as viewing reading as ‘precise, sequential identification’, with the
consequence that children should be made to be more careful in
their identification first of letters, then of words. From his work
with young native speakers (in the 1967 paper, one of the subjects
is definitely L1, the other appears to be bilingual), Goodman
concluded that this view of reading was wrong, that rather than
painstakingly going from letter to letter, word to word, readers in
fact sampled the text, employing text redundancy to reduce the
amount of data needed and using their language knowledge (syntax
and semantics) to guide their guesses. His model of reading,
then, sees the reader as (1) scanning a line of text and fixating at
a point on the line; (2) picking up graphic cues guided by constraints
set up through prior choices, his language knowledge, his
cognitive styles, and strategies he has learned (p. 270); (3) forming
an image which is ‘partly what he sees and partly what he
expected to see’, then making a tentative choice (presumably as
to the identity of the word).
It can be seen from the description above that Goodman’s
model is top-down, to the extent that readers’ expectations are
seen as being brought to the text, i.e. the model is reader-driven.
Secondly, the reading process is seen as cyclical, the reader moving
from hypothesis to text to hypothesis, and so on.
The popularity and influence of Goodman’s first paper was
probably due to a number of reasons. First, it offered an alternative
to what might be seen as the grind of moving from letter to
letter, word to word. Learning reading became a more exciting
business. Secondly, it fitted what Chomsky (1965) was saying at
the time about human language users imposing existing ‘rules’ or
expectations on ‘degenerate’ data. Finally, it meshed well with,
although probably pre-dated, notions which became commonplace
about texts always being incomplete and being completed by the
readers by referring to their background knowledge.
The importance which Goodman attributes to hypothesis formation
and sampling has had a considerable influence on L2
reading theory: see, for example, Hosenfeld’s claim that the good
reader is a good guesser (e.g. Hosenfeld, 1984). It is also the
aspect which has turned out to be most vulnerable. One criticism
comes from studies of eye movements; Rayner and Pollatsek (1989)
point out that fixations occur on the majority of the words in a
text. While this is only indirect evidence of the process of reading,
it does not conform easily to Goodman’s claims that only part of
the text was sampled. But perhaps the most damaging criticism
concerns the claim by Goodman, Smith and other writers that
good readers guess more, and use the context more than poorer
readers. A great deal of work has shown, quite conclusively, that
while all readers use context, good readers are less dependent on
it than poor ones. In fact, it has been shown that what distinguishes
good from poor readers, at least among young populations,
is the ability of the members of the first group to decode rapidly
and accurately.
Goodman found that his young subject read words in a supposedly
‘difficult’ text which she failed on when she encountered
them in a more or less meaningless, phonics-focused text. At other
times, he claimed that readers read (i.e. recognized words more
accurately) when faced with the words in a real text than when
the same words were met in a list. However, when Nicholson
(1993a) tested these claims with quite large groups of children,
the results did not seem to support Goodman’s position. While
the results are not always clear cut, it seems that it was the poor
and average readers who benefited from contexts; older and better
readers seem to have been mainly affected by a practice effect, in
that they made fewer errors on the second presentation, whether it
was a list or a text (Nicholson is not clear as to how much time was
allowed between the two trials). In fact it is virtually accepted in
psychology nowadays that, at least at the level of word recognition
and lexical access, some form of bottom-up process is followed.
In spite of this, as has been said above, the assertion by some
that good readers use a bottom-up approach is only really proven
for word recognition. Nicholson, as described above, only partially
succeeded in contradicting Goodman’s findings, and it is not
easy to see how a bottom-up approach can account for Goodman’s
original data. It is possible that his model is more appropriate for
L2 readers at certain stages of development than it is for skilled
adult L1 readers. Goodman has also been more careful than some
writers in distinguishing between reception and production.
Interactive approaches
Bottom-up models are sequential, in that one stage is completed
before another is begun. In interactive models - one of which was
first credited to Rumelhart (1977) - such a regular sequence does
not occur. As we noted above for Stanovich (perhaps the best-
known proponent of interactive models), in interactive models, a
pattern is synthesised based on information ‘provided simultaneously
from several sources’ (1980: 35). In Rumelhart’s model, once a
Feature Extraction Device has operated on the Visual Information
Store, it passes the data to a Pattern Synthesiser which receives input
from Syntactical, Semantic, Lexical and Orthographic Knowledge,
all potentially operating at the same point. If one takes Stanovich’s
description as defining interactive models, then Goodman’s is one
such, since, according to him, ‘Readers utilize not one, but three
kinds of information simultaneously’ (Goodman, 1967: 266). The
information is orthographic, syntactic, and semantic.3
Interactive-compensatory approaches
Stanovich calls his model an ‘interactive-compensatory’ one. The
compensatory refers to the idea, intuitively appealing, that a weakness
in one area of knowledge or skill, say in Orthographic Knowledge,
can be compensated for by strength in another area, say
Syntactical Knowledge. At the risk of labouring a point, we might
claim that Goodman’s account contains this notion, since he refers
to weaknesses in the orthographic area being made up for by the
‘strong syntax’ of a real text, meaningful to the young reader. The
notion of compensation has been alluded to in research in L2
reading, for example in Alderson and Urquhart (1985), where it
was hypothesised that background knowledge might make up for
inadequate language skills.
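
The notion of compensation can be illustrated with a deliberately simple sketch. The Python below is our own toy formulation and should not be read as Stanovich's model: the candidate words, the evidence scores and the reliability weights are all invented, and a weighted sum is only one of many ways in which sources might be combined. It shows a reader whose orthographic evidence is weak and slightly misleading choosing the wrong word in isolation, but the right one when syntactic evidence from a real text is also available.

```python
# Toy sketch of compensation between knowledge sources.
# All scores and weights are hypothetical and chosen for illustration only.

def best_candidate(candidates, evidence, reliability):
    """Return the candidate with the highest reliability-weighted support.

    evidence[source][candidate] is a score between 0 and 1;
    reliability[source] is how far this reader can rely on that source.
    """
    def support(candidate):
        return sum(
            reliability[source] * scores.get(candidate, 0.0)
            for source, scores in evidence.items()
        )
    return max(candidates, key=support)


CANDIDATES = ["horse", "house"]

# A reader with weak orthographic knowledge but strong syntax.
WEAK_DECODER = {"orthographic": 0.2, "syntactic": 0.8}

# The printed word is 'horse', but this reader's letter-level evidence
# leans slightly the wrong way.
WORD_IN_LIST = {"orthographic": {"horse": 0.45, "house": 0.55}}

# The same word met in a sentence that strongly favours 'horse'.
WORD_IN_TEXT = {
    "orthographic": {"horse": 0.45, "house": 0.55},
    "syntactic": {"horse": 0.9, "house": 0.1},
}

in_list = best_candidate(CANDIDATES, WORD_IN_LIST, WEAK_DECODER)
in_text = best_candidate(CANDIDATES, WORD_IN_TEXT, WEAK_DECODER)
```

Under these invented figures the word is misread as ‘house’ in the list condition and read correctly as ‘horse’ in the text condition. Since the weights would have to be set separately for every reader, the sketch also hints at why such models explain results more readily than they predict them.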
Interactive-compensatory models are very attractive and have
received a great deal of support. Their main weakness, from the
experimental point of view, is that, as Rayner and Pollatsek (1989:
471) point out, they are very good at explaining results but comparatively
poor at predicting them in advance. To some extent
this is because each reader must be viewed as potentially different,
with different strengths and weaknesses. Hence two readers may
on one occasion arrive at the same level of performance by utilising
different strengths. But this situation, while exasperating for
the model-building psychologist, may simply reflect a widely pertaining
reality.
Before we leave these models, which all to some extent attempt
to mirror the actual process of reading, we shall make some points
about such models in general. First, one consequence of the advent
of interactive models is that an almost infinite variety of models
might seem possible, since one can have all sorts of variations of
interactive top-down and interactive bottom-up models. Thus
Rayner and Pollatsek consider Just and Carpenter’s model to be
basically bottom-up, with interactive elements, while they concede
that Goodman’s model, while basically top-down, also might be
said to have interactive aspects.
Secondly, while there does seem to have been a swing towards
bottom-up models (see the remark by Hoover and Tunmer earlier),
it should be stressed that the empirical evidence in favour of
such models is strongest only in the area of word recognition and
lexical access. Beyond that stage, there is comparatively little agreement,
so all sorts of model may be possible.
Thirdly, the psychologists take as a given what is sometimes
referred to as ‘normal reading’; Rayner and Pollatsek narrow this
down to the careful reading of textbooks. But while such a position
may be convenient for experimenters, it is too narrowly
defined to be acceptable to those interested in the whole range
of reading activities. Once we include other kinds of reading as
legitimate, we may then be tempted to take the view that different
tasks may require different types of reading and different models
of the processes involved. Thus it might seem reasonable to suggest
that search reading (see below) is largely reader driven, while
the careful reading of new material is likely to be predominantly
text driven. And investigations by educational psychologists (e.g.
Entwhistle et al., 1979) suggest that either text driven or reader
driven may be the preferred styles of particular classes of reader
(see Section 2.3). While appreciating the cognitive psychologists’
attempts to equip themselves with operational definitions in order
to make testable predictions, we must always keep in mind the
sheer complexity of the activities grouped under the term ‘reading’.
Thus Gibson and Levin (1974) deny the possibility of having
models of the reading process, precisely because of different styles
and different responses to different reading tasks.
Componential models
The models we have looked at above attempt to describe the
actual process of reading, a cognitive activity operating in real
time. In fact, Rayner and Pollatsek are critical of some models,
including those of Goodman and, to a lesser extent, that of
Rumelhart, in that they are insufficiently explicit about the process,
and hence, in their terms, are not really models at all. The
componential descriptions we now look at do not even begin to
model the process, consisting as they do simply of areas of skills or
knowledge thought to be involved in the process. According to
Hoover and Tunmer (1993), such descriptions try to model reading
ability rather than the reading process. The use of such
componential models, again according to Hoover and Tunmer,
is ‘to understand reading as a set of theoretically distinct and
empirically isolable constituents’ (1993: 4). Thus one should be
able to account for different reading performance in terms of
variation in one of the components.
Word recognition
We should start by saying that even the definition of the meaning
of the term ‘word recognition’ is disputed. Hoover and Tunmer
(1993) mention three interpretations. The most obvious one would
be to have the term mean ‘recognize an English word in print, be
able to pronounce it, and give its meaning’. Some people would
dispute the need to include pronunciation (the phonological part).
Many of the experiments on word recognition have involved the
use of pseudo-words, such as ‘mard’. So we might have to extend
the term to mean ‘recognition of pronounceable strings of letters
which are not actual words in English’. But some of the experiments
involve recognition of ‘unpronounceable’ pseudo-words.
So we might have to add ‘recognition of any letter string with space
boundaries on either side’.
Then the processes of word recognition appear to be extremely
complicated, and not well understood. So we will content ourselves
here with stating what is generally agreed, what facts any
theory has to account for, and the immediate implications for a
reader of a foreign language. In our account we rely mainly on
Rayner and Pollatsek (1989), keeping in mind that most of what
they have to say concerns skilled adult readers.
First, as we reported earlier, letters are not processed serially. If
they were, then the time taken to recognize a word would be
longer than the time needed to recognise a single letter, and the
longer a word, the longer it would take to recognise. Within reason,
this does not seem to be the case. Furthermore, subjects are more
accurate in reporting letters in words, e.g. the ‘d’ in ‘word’, than
they are in recognizing ‘d’ by itself.
On the other hand, it cannot be the case that words are recognised
as templates or pictures. In the experiments mentioned
above subjects were as quick to report letters in pseudo-words like
‘orwd’ as they were to report the single letter, although it is inconceivable
that they had a template for such pseudo-words. And as
Rayner and Pollatsek point out, the notion of templates cannot
account for subjects’ ability to read in a number of different fonts.
So Rayner and Pollatsek settle on a modified version of a model
by Paap et al. (1982), in which the visual input goes first to FEATURE
DETECTORS, then to LETTER DETECTORS, and finally to WORD DETECTORS.
Since on their own admission the evidence for the viability
of the model is too complex to present in their book, we shall
content ourselves with mentioning it.
Given that in normal reading the purpose of word recognition
is to access the lexicon, it is generally recognised that there are
two routes to this. The first, known as the direct route, goes straight
from the visual input to meaning without recourse to sound; the
other, known as the phonemic or phonological route, goes from
visual input to sound to meaning. Evidence that readers of English
use a direct route comes from consideration of the writing
system; the system notoriously contains too many irregularities to
allow total reliance on the phonological route. However, use of
the direct route alone cannot explain subjects’ ability to handle
pseudo-words like ‘mand’, or ‘birn’, which are not likely to be
present, and hence accessible, in the reader’s lexicon. Other evidence
for a phonological route comes from phonological influence
on word recognition: recognition of words like ‘touch’ has been
shown to be slowed down when they are preceded by, in this case,
the word ‘couch’. Also recognition of pseudo-homophones (non-words
which share the same pronunciation as real words, e.g.
‘phocks’) has been shown to be slower than recognition of other
pseudo-words. Here, presumably, the activation of the pronunciation
collides with the real word in the lexical entry, thus causing
confusion.
Finally, Rayner and Pollatsek cite as evidence for two routes the
fact that so-called surface dyslexics can pronounce most words,
but regularize irregular ones, while phonemic dyslexics pronounce
most words correctly but cannot pronounce non-words. This can
be explained by arguing that the first group are limited to the
phonological route, and the second group to the direct route.
Thus Rayner and Pollatsek (1989: 109) conclude that ‘the common
ground for all positions is that direct visual access is important
and that sound encoding plays some part’.
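
The two-route account above can be made concrete with a toy sketch in
Python. It is an illustration under stated assumptions, not Rayner and
Pollatsek's model: the three-word lexicon, the single rime rule and the
'onset+RIME' notation are all invented for the example. Switching off one
route at a time loosely mimics the two dyslexic patterns described above.

```python
# Toy illustration only: the word list and the onset + rime rule table
# below are invented for this sketch, not taken from the source.

# Whole-word store used by the direct (visual) route: spelling -> pronunciation.
LEXICON = {
    "mint": "m+INT",   # regular: follows the rime rule below
    "hint": "h+INT",   # regular
    "pint": "p+YNT",   # irregular: the stored form differs from the rule
}

# Spelling-to-sound rule used by the phonological route: rime -> sound.
RIME_RULES = {"int": "INT"}


def direct_route(word):
    """Look the whole word up; returns None for anything not stored."""
    return LEXICON.get(word)


def phonological_route(word):
    """Assemble a pronunciation from onset + rime rule, ignoring the lexicon."""
    for rime, sound in RIME_RULES.items():
        if word.endswith(rime):
            onset = word[: -len(rime)]
            return f"{onset}+{sound}"
    return None


def read_aloud(word, use_direct=True, use_phonological=True):
    """Try the direct route first, then fall back on the rule-based route."""
    if use_direct:
        found = direct_route(word)
        if found is not None:
            return found
    if use_phonological:
        return phonological_route(word)
    return None
```

With both routes available, the irregular word 'pint' is retrieved whole
and the pseudo-word 'vint' is assembled by rule. With the direct route
disabled, 'pint' is regularized to rhyme with 'mint' (the surface dyslexic
pattern); with the phonological route disabled, 'vint' cannot be
pronounced at all (the phonemic dyslexic pattern).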
Two other aspects of word recognition seem to be relevant
here. Most of the research appears to have been done with single
morpheme words, such as ‘touch’ and ‘word’. It seems, to say the
least, somewhat odd to suggest that there is a separate lexical
entry for, say, not only ‘expect’ but also ‘expects’, ‘expected’,
‘expecting’, ‘expectation’, ‘unexpected’, etc. Rayner and Pollatsek
report evidence that, in such cases, the root morpheme is accessed
first, together with some evidence that ‘content’ words are stored
separately from ‘function’ words, so that, for example, ‘expect’
would appear in a different lexicon from, say, ‘the’. Function words
would seem, anyway, to pose problems for the ‘phonological route
only’ school: it seems, frankly, incredible that a skilled reader,
reading fast, will distinguish between different phonologically
conditioned pronunciations of ‘the’.
Word recognition in L2
Randall and Meara (1988) remark that most L2 reading research
‘has been centred on the relatively higher-order skills of discourse
organization and the interpretation of continuous text’, and say
that this is ‘for obvious reasons’. Perhaps one obvious reason is
that many of the potential subjects are presumed to be past the
‘simple’ stage of word recognition by the time they become available
to researchers, though such an assumption is by no means
certain. A further reason is that many L2 researchers have a training
in applied linguistics, which has tended to ignore this area.
Whatever the reasons for the neglect, it is clear that word recognition
poses intriguing problems for L2 reading researchers.
Take, for example, the question of phonological access to the
lexicon, which presupposes that for a word to be accessed in reading,
the lexical entry must contain a phonological component, i.e.
it must contain information as to how the word is pronounced. In
most cases, skilled adult L1 readers can be assumed to have this
information, while most difficulties for young L1 learners can be
avoided by careful control of vocabulary. But these assumptions
cannot be made with respect to many L2 readers; in a huge number
of cases they are going to come across words which they have not
heard pronounced. Does this mean they are unable to access them?
Then there is the question of the script with which the L2 learners
are familiar when they begin to read the L2. If the learners come
from a different orthographic tradition, is this likely to affect their
reading in the L2?
Language
It has already been pointed out that the distinction between language
and word recognition, drawn by Hoover and Tunmer, is
not a clear one, since the lexicon is clearly part of our linguistic
competence. However, we are keeping the distinction here for
convenience. We need, however, to break the topic down into
more manageable subtopics. Hence, below, we discuss syntax, then
cohesion, and larger aspects of text structure.
Inner speech
Before we consider syntax, we should look briefly at the notion of
inner speech. This is the ‘voice in the head’ which many of us are
aware of while we are reading. Rayner and Pollatsek (1989), who
devote considerable space to the issue, try to distinguish inner
speech from subvocalisation. The latter, which involves actual physical
activity in the speech tract, used to attract the attention of
reading teachers, who taught that it should be suppressed. However,
electromyographic recording shows that subvocalisation is a
normal part of silent reading.
Inner speech is more interesting. There is considerable evidence
that the sounds of words influence the speed or accuracy
of silent reading. Rayner and Pollatsek report that readers find
strings like
Crude rude fude stewed food
difficult to read silently. They suggest that the effect of inner
speech is post-lexical, i.e. occurs after lexical access, and that its
function is to hold material in the working memory until it is
processed. Perfetti and McCutchen (1982) claim that inner speech
is not a complete representation of every word in the text, but is
biased towards the beginning of words. They suggest that function
words may not require as elaborate a phonetic representation as
content words.
As far as we know, nothing is known about the effect of the
presence or absence or form of inner speech in L2 readers. Koda’s
experiment, reported above, was directed towards this area, but
produced no convincing results. Rayner and Pollatsek (p. 211)
report that inner speech ‘may be somewhat less important in Chinese
than in English’. With reference to deaf L1 readers, they
report that ‘the comprehension and memory advantages provided
by one’s primary language weigh heavily in the choice of a recoding
system’ (p. 210). One would speculate that if the L1 reader of
English relied on a phonological rendering of the message to
assist in processing syntactic units, then the L2 reader is likely to
be doubly handicapped, being uncertain of both the syntax and
the phonology. It is just possible that the finding by Dhaif (1990)
that Arab students’ comprehension of written English was significantly
improved by the teacher reading aloud while they read
silently in parallel has some relationship to inner speech.
Syntax
In addition to words being recognised, the significance of the
relationships between them (e.g. the syntax) needs to be extracted
by the reader. It would be reasonable to expect that, given the vast
amount of work which has been done in linguistics in the area of syntax,
we would be well informed as to how readers operate. This, however,
is not the case (the reader may remember that the syntactic
and semantic component in Gough’s model was called ‘Merlin’).
Rayner and Pollatsek mention a number of approaches, none of
which seems to have attracted anything like the attention in psychology
that problems of word recognition or eye movements have.
They single out for special mention two approaches. The first,
they refer to as the ‘Clausal’ model of processing, developed in
the 1970s. This ‘model’ consisted of a number of pragmatic strategies,
e.g. ‘take the first clause to be the main clause unless there
is a subordinating conjunction’. Rayner and Pollatsek are fairly
dismissive of this approach, claiming that such pragmatic rules
would form ‘an unsatisfactory hodgepodge’ (p. 246). However, it
seems to us that L2 readers may well build up such a set of strategies,
partly derived from their L1, partly constructed to deal
specifically with the L2.
The approach Rayner and Pollatsek favour is the so-called ‘garden
path’ approach. This contains two main principles. According
to the first, known as ‘minimal attachment’, the reader structures
data to try to minimize the number of grammatical nodes required.
Thus, a sentence like ‘The girl knew the answer by heart’
is likely to cause fewer processing difficulties than the sentence
‘The girl knew the answer was wrong’ since the latter, in terms of
Phrase Structure Grammar, requires a subordinate sentence node
not required for the former sentence. The second principle, known
as ‘late closure’, claims that, when grammatically possible, readers
will attach new items to preceding items rather than subsequent
ones. Hence, given the two sentences
Since Jay always jogs a mile this seems like a short distance to him.
Since Jay always jogs a mile seems like a long distance to him.
readers are predicted to have fewer problems with the first, since
in both cases they will initially attach ‘a mile’ to ‘jogs’ - a move
which works with the first sentence but has to be revised in the
case of the second.
Rayner and Pollatsek claim that experiments using observation
of readers’ eye movements support the existence of both principles.
However, on the examples they produce, it is not clear to
us that there are, in fact, two principles involved. In both sets of
sentences quoted above, the readers’ difficulties might be attributed
to taking ‘knew’ and ‘jogs’ as transitive verbs requiring an
NP object, then assuming that the first possible NP, ‘the answer’
and ‘a mile’, completes the Verb Phrase; in other words, a version
of ‘late closure’. In fact, one is tempted to agree with Ridgway
(1997) that an approach based on ‘dependency’ grammar would
be fruitful. Such grammars, however, have the disadvantage of
being relatively underdeveloped.
Syntax in L2 reading
When we turn to grammar in L2 reading, we find again a dearth
of data. There are probably at least two factors involved here.
Some years ago it was not uncommon to find EFL books containing
‘reading passages’ which seemed to have been included mainly
to supply fodder for grammar teaching. If one considers written
text in this way, then it is not likely that one will investigate
the effect of one on the other. After this, the ‘communicative’
approach tended to stress language use, and hence disparage
attention being paid to ‘knowledge’ areas such as syntax. Finally,
as commented on by Randall and Meara (1988), a concentration
on ‘high-level’ factors such as background knowledge, skills and
strategies, led to the comparative neglect of lower-level factors
such as syntax.
We have been taking the conventional position that syntactic
parsing of some kind was necessary in order to impose meaning
on the words recognised. This apparently commonsensical position
has been contradicted by findings of Ulijn and his associates. In
Ulijn and Kempen (1976), Dutch and French speakers read a text
about finding their way around an imaginary town, Beausite. The
text, which was in French, existed in two versions. In one version,
French syntactic structures not found in Dutch were included. In
the other, such structures were avoided. However, there was no
difference in either the Dutch or the French readers’ responses
to the text. Ulijn and Kempen conclude that: ‘under normal conditions
reading comprehension is little dependent on a syntactic
analysis of the text’s sentences.’
In later experiments (Strother and Ulijn, 1987) students from
different linguistic backgrounds - English and others - read a text
on an aspect of computer science. One version of the text was the
original; in the other, ten ‘passages’ (i.e. sentences) had been
‘simplified’ in certain specific syntactic ways, e.g. passives were
replaced by active equivalents, nominalisations by expanded Noun-Verb
constructions. Again no significant differences were found
between responses to the original and simplified text. Strother
and Ulijn conclude that readers use a ‘conceptual strategy’, consisting
largely of knowledge of word meanings together with knowledge
of the text’s subject area. Thus, in the model, a syntactic
element could be eliminated.
As far as simplification of text for L2 readers is concerned,
there may well be a case for an emphasis on lexis, as Strother and
Ulijn argue, though whether results based on the ‘simplification’
of ten sentences of a text of unspecified length is good evidence
for this is debatable. To claim, however, as Ulijn seems to do at
times, that syntactic processing is not necessary, is frankly unbelievable.
This is easily demonstrated. The following string represents
an English sentence from which most (not all) function
words and all inflectional morphemes have been deleted. Moreover,
since ordering plays a major part in English syntax, the
order of the remaining words has been jumbled.
begin several it recogniser module machine digital pass record speech
We challenge anyone, whether expert in the content area (artificial
language) or not, to process this string. Things begin to be a
bit better if we restore the original ordering:
Machine begin digital record speech pass it several recogniser module
However, it is only when we restore function words and inflections
that the message becomes easy to extract:
The machine begins by digitally recording the speech and passing it to
several recogniser modules.
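
The three stages of this demonstration can be reproduced with a short
Python sketch. It is illustrative only: the list of deleted function
words and the table of base forms are hand-written to mirror the strings
printed above (including the reduction of 'digitally' to 'digital'), and
the shuffle seed is arbitrary, so the jumbled order it prints will not
necessarily match the one given in the text.

```python
import random

SENTENCE = ("The machine begins by digitally recording the speech "
            "and passing it to several recogniser modules.")

# Function words deleted in the text's example ('it' and 'several' are kept,
# matching "most (not all)").
DROPPED = {"the", "by", "and", "to"}

# Hand-written reductions mirroring the string in the text; an assumption of
# this sketch, not a general morphological analyser.
BASE_FORM = {
    "begins": "begin",
    "digitally": "digital",
    "recording": "record",
    "passing": "pass",
    "modules": "module",
}


def strip_syntax(sentence):
    """Drop the listed function words and reduce words to their base forms."""
    words = sentence.lower().rstrip(".").split()
    return [BASE_FORM.get(w, w) for w in words if w not in DROPPED]


def jumble(words, seed=0):
    """Return the same words in a pseudo-random order (seed is arbitrary)."""
    shuffled = list(words)
    random.Random(seed).shuffle(shuffled)
    return shuffled


ordered = strip_syntax(SENTENCE)
print(" ".join(jumble(ordered)))  # word order and morphology both removed
print(" ".join(ordered))          # original order restored
print(SENTENCE)                   # function words and inflections restored
```

Each printed line restores one layer of syntactic information, which is
the point of the demonstration: the content words are identical in all
three versions, and only the syntax differs.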
The subjects used by Ulijn and his associates were comparatively
expert in the L2: for example, the Chinese students used by
Strother had TOEFL scores of 550+ and had been in the USA for
nine months. Ulijn’s Dutch students had studied English at secondary
school for six years and had had ‘considerable exposure to
English’ (Ulijn and Kempen, 1976: 94). It can be assumed, therefore,
that the subjects’ syntax was sufficient to cope with whatever
was given them. It might, of course, be reasonably concluded that,
at their level, extra syntactic tuition would give smaller returns
than an emphasis on vocabulary building. But that is a very different
thing from claiming that ‘reading comprehension is little
dependent on a syntactic analysis’.
One point of interest that can be retrieved from the work
of Ulijn and his associates can be found in the remark that ‘a
thorough syntactic analysis is unnecessary’ (our italics). It has
sometimes been claimed that the amount of syntactic knowledge
necessary for reading is less than that required for writing or
speaking. Thus, given that the readers had enough background
knowledge, they might make quite reasonable sense of, say,
machine begin by digitally record speech and pass it to several recogniser
module.
In other words, a successful processing of this text might not
depend on a detailed knowledge of the determiner system, morphological
marking of plurals, etc. Thus it might be possible to
distinguish between a receptive and a productive syntactic processor
(see Section 5.2).
Alderson (1993) has produced evidence of a strong connection
between grammar and reading. During the preparation of
the IELTS test, item writers were instructed to produce a test of
grammar which could be used along with tests of reading, listening
and writing in the total test. After tests had been trialled in
both the UK and Australia, it was found that very high correlations
held between the grammar test and different tests of reading.
For example, the correlation between the grammar test and
the science and technology reading test was 0.80. This was in spite
of the fact that the grammar test was designed as a test of grammar
in general, rather than of structures found in the reading
tests, and that the reading tests did not include any specifically
grammatical item.
While we have some doubts about details of the tests used (see
Part 5), there seems little doubt that Alderson is correct in concluding
that ‘it must be the case that, in some intuitive sense, a
reader must process the grammar in a text in order to understand
it’, and that ‘. . . the evidence certainly does not support any claim
that one can successfully understand text without grammatical
abilities’ (p. 219).
However, this is more or less where the case rests. We don’t
know how L2 readers process texts syntactically, though, as mentioned
above, we may suspect that they apply a collection of pragmatic
strategies, e.g. in English, the first NP is likely to be the
Subject. These strategies are likely to be influenced by their experiences
with reading in their L1, as Cowan (1976) has posited. It is
quite likely that more breakdowns occur in processing than are
obvious on the surface; one of the authors discovered that Indonesian
students seemed not to be able to assign Subject or Object
roles to nouns in relative clauses. But little is known of this area.
Background knowledge
Both Coady’s and Bernhardt’s models contain a component called
‘background knowledge’ (in Coady’s model) or ‘world knowledge’
(in Bernhardt’s model). Hoover and Tunmer also mention background
knowledge but only to attempt to exclude it. They are
interested in reading ability, rather than reading performance;
presumably also one’s background knowledge can be assumed to
be constant whether one is reading or listening, and therefore
cannot be used to distinguish between the two activities. In L2
reading, however, and in particular in the area of LSP, the concern
is more to predict performance on particular reading tasks,
and for this, the background or world knowledge of L2 readers
may well need to be taken into account.
The theoretical justification for including background knowledge
as a component of our reading model can be seen as deriving
from two different sources. First, it is part of the theory of
comprehension associated with the notion of ‘schemata’ (see below)
that text is never complete, and that the reader (or listener) must
supply additional material derived from their existing knowledge
of the world. From this point of view, background knowledge
is inevitably present in all kinds of reading, both L1 and L2.
The second source is interactive models of the reading process.
Although not by any means constructed with L2 readers in mind,
such theories are good at predicting that L2 readers, with significant
defects in their knowledge of the language, may sometimes
perform as well as L1 readers. The theory will predict that, assuming
they have the required background knowledge, L2 readers
may use this knowledge to compensate for linguistic shortcomings.
The possibility of such an outcome is of practical importance in
deciding, for example, whether an L2 student is capable of proceeding
to an academic course of study involving reading in their
own speciality. Thus background knowledge has been of particular
interest to those involved in testing and teaching LSP (see
Chapter 3).
There is a considerable amount of experimental evidence in
L2 reading that background knowledge can play the part envisioned
for it in the theory. Bernhardt (1991b) gives an extensive
list of studies, to which we refer the reader. The majority of studies
she cites were successful in showing that readers’ familiarity with
content had a significant effect on their performance. However,
in a number of cases no such effect has been found (e.g. Clapham,
1990, who found that ‘subject area had no significant effect on
scores’). Because of this, we shall focus on the conditions required
before the effect of background knowledge becomes evident.
We shall begin by examining two studies - one by Mohammed
and Swales (1984) and one by Alderson and Urquhart (1985) -
which will serve to illustrate some of the relevant factors. Both
these studies belong to the same group in Bernhardt’s classification,
being concerned with background knowledge of topic rather
than cultural background. However, if we accept the claim by
Widdowson (1978), that science constitutes a culture, this division
becomes somewhat arbitrary.
Mohammed and Swales gave twelve postgraduate students the
tasks of using an instructional pamphlet to (a) set the current
time on a digital clock and (b) set the alarm for a specific time the
next day. Performance was measured in time required to accomplish
the tasks. The subjects were categorized as (a) Native speaker
scientists (NS), (b) Native speaker arts (NA), (c) Non-native speaker
scientists (NNS), (d) Non-native speaker arts (NNA). The linguistic
proficiency of the non-native speakers was arrived at using teachers’
estimates of the subjects’ potential band scores on the IELTS test:
General and Reading modules.
The subjects were video-taped during the tasks, to investigate
their overall behaviour (reading different parts of the instructions,
manipulation of controls, etc.). While this record is irrelevant
here, such data are clearly important for studies of different reading
behaviours and different reading models.
The main measure of performance was the time subjects took
to complete the tasks. They did so in the following order of proficiency:
(1) NS, (2) NNS, (3) NA, (4) NNA. Thus the non-native
speaker scientists performed better than the native speaker arts
subjects, in spite of the fact that their average band score was 5.4,
while the native speakers were assumed to have a band score level
of 9. Moreover, the NNS group also outperformed the NNA group,
in spite of the fact that the latter group had an average band
score of 7.9. Mohammed and Swales (1984) ascribe the difference
between groups to ‘either field-familiarity or, more likely, familiarity
with the genre of technical instructions’ (p. 211), and express
surprise at the strength of the influence of technical experience,
and the ‘apparent unimportance of general English proficiency
above a presumed threshold level’ (p. 216).
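
The pattern reported above can be checked with a few lines of Python. The
sketch below uses only the figures given in the text (the rank order of
the four groups and their band scores, with the native speakers' band of
9 being the assumed value); the variable and function names are invented
for illustration. With four small groups this is a descriptive check, not
a statistical test.

```python
# Figures as reported in the text: completion-time rank (1 = fastest) and
# average band score. The band of 9 for native speakers is the assumed value
# mentioned in the text, not a measured one.
GROUPS = [
    # label, science background, band score, rank
    ("NS",  True,  9.0, 1),
    ("NNS", True,  5.4, 2),
    ("NA",  False, 9.0, 3),
    ("NNA", False, 7.9, 4),
]


def ranked(groups):
    """Return the groups sorted from fastest to slowest."""
    return sorted(groups, key=lambda g: g[3])


def never_increases(values):
    """True if each value is less than or equal to the one before it."""
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


by_rank = ranked(GROUPS)
science = [int(g[1]) for g in by_rank]   # [1, 1, 0, 0]
bands = [g[2] for g in by_rank]          # [9.0, 5.4, 9.0, 7.9]

print("science background tracks rank:", never_increases(science))
print("band score tracks rank:", never_increases(bands))
```

Science background falls in step with rank, whereas band score does not:
the second-ranked group has the lowest band of the four.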
Alderson and Urquhart (1985) carried out a series of three
studies using subject-related groups of L2 postgraduate students,
namely Engineers (ENG), Science and Maths (SM), Development
Administration and Finance/Economics (DAFE) and Liberal Arts
(LA). In Studies 1 and 2 there were three groups of texts, aimed
primarily at the ENG, DAFE and LA groups. The tasks were gap-filling
(Studies 1 and 2), and gap-filling plus short-form answers
(Study 2). In Study 3, three modules of the ELTS test, Technology,
Social Sciences, and General Academic were used.
The results were inconsistent. In Study 1, ENG outperformed
DAFE on the engineering texts, as predicted, while DAFE outperformed
ENG on the DAFE texts. However, in Study 2, while the
DAFE group outperformed ENG on the DAFE texts, there was no
difference between the two groups on the ENG texts. In Study 3,
in contrast, the SM-ENG group (ENG and SM combined) outperformed
the DAFE group on the Technology module, while on the
Social Science module, the two groups did not differ significantly.
Alderson and Urquhart concluded that in the case of the Engineering
texts, the background knowledge of the engineering students
was compensating for their comparatively low level of language
proficiency.
Thus while the studies provided evidence of an effect of background
knowledge, this effect was not consistent throughout. There
was also evidence of a factor related to language proficiency (the
LA groups, which scored higher throughout on measurements
of proficiency than the other groups, in virtually all the
tests either equalled or surpassed the other groups). There was in
addition evidence of both text effect (some texts proving consistently
easier than others, though on a Fog Readability Index they
were equivalent) and of method effects.
In spite of such inconclusive evidence, it seems to us undeniable
that background knowledge has an effect on reading. While this is
probably true for all texts, it is most easily comprehended in relation
to what, for some people, are highly specialised texts. Given
such a text on nuclear physics, for example, taken from a professional
journal, it seems undeniable that a professional physicist
will read it differently from most of the readers of this book (as
well as the authors). We can state this in relativistic terms, and say,
as do Harri-Augstein and Thomas (1984), that our comprehensions
will be different, or we can be more absolute and claim that the
physicist’s reading is likely to be better. We think most people, after
comparatively little reflection, will be inclined to agree with this.
If, however, we accept that background knowledge is involved
in all normal reading, then we are obliged to account for the fact
that studies have not always been able to detect a significant difference
brought on by apparent differences in knowledge brought
to the task by experimental subjects. There are at least three factors
to be discussed, and we shall now discuss them factor by factor,
referring to the two studies described above.
Texts
If our aim is to show, as it was in Alderson and Urquhart, that two
or more groups of readers will perform differently as a result of
differences in background knowledge, then it seems obvious that
the texts used should be as specialised as possible. Clearly, a text
which is equally accessible to both groups in terms of the knowledge
required will not show any difference between the groups.
Clapham’s remark that it is only with highly specific texts that
background knowledge has an effect on student test performance
seems almost too obvious to make (Clapham, 1996a). In the first
two studies by Alderson and Urquhart, the Engineering texts were
chosen with the help and advice of an academic engineer. Even
so, on the whole they failed to discriminate between engineers, on
the one hand, and science and maths students on the other, presumably
because of being insufficiently specialised in the direction
of the engineers. If one uses parts of existing tests aimed at a
wide range of testees, as was the case in Study 3 of Alderson and
Urquhart, and also in Clapham (1990), then there is a danger of
the texts being insufficiently specialised, having been filtered by
test constructors and editing committees. This was certainly the
case for the IELTS tests used by Clapham. The focus of the research
then shifts to the question of whether, in cases like IELTS,
having ‘specialised’ ESP modules is justified. This is a worthwhile
question to ask, but tells us little or nothing about the effect of
background knowledge.
In fact, some of the conflicting findings in the literature may
be traceable back to a difference in focus on the part of the
researchers, leading to a difference in text selection. In the first
two studies by Alderson and Urquhart, the focus was on possible
differences caused by differing background knowledge. In their
third study, and in some at least of Clapham’s work, the focus
changes to whether the texts (and possibly tasks) of existing ESP
tests are successful in discriminating between different groups,
and are therefore worth using. It is perfectly possible to consider
that background knowledge is an important factor in reading
performance while at the same time being of the opinion that
broad-based ESP tests are probably not worth having.
In view of what has been said above, the text used in Mohammed
and Swales - instructions for setting a digital clock - seems
slightly anomalous. After all, it was presumably not aimed at
a specialised audience: digital clocks are widely used by non-scientists.
Presumably we have here a case of a sender of a message
who has not taken sufficient account of the skills of their
audience.
Subjects
It follows from what has been said above that if we are to find
significant differences between groups, the groups should be as
different as possible from each other in terms of the relevant
knowledge each group possesses. This is fine in theory but often
difficult in practice. Suppose we try to use two groups, one of postgraduate
management students, the other of engineers. One often
finds (generally after the experiment) that some members of the
management group have prior training as engineers. Mohammed
and Swales operated with two groups, scientists and arts students.
Yet the distinction is a very rough and ready one. Many linguists
would wish to describe themselves as scientists, but Mohammed
and Swales classified students of applied linguistics as arts students.
Tasks
The test tasks used by Alderson and Urquhart in Studies 1 and 2
were of the form of gap-filling and short-form answers. Such tasks
have the advantage that they are easily designed and administered.
They have the disadvantage that they are not particularly appropriate
in terms of reading either to the texts or to the readers.
Hence, if no difference is found, suspicion may fall on the tasks.
Bernhardt (1991b), for example, has suggested that the frequent
use of cloze procedure may be a factor in obscuring the effect of
background knowledge. An alternative is to use tasks which are
functionally appropriate both to the text and to the readers. Preferably,
they should be composed by members of the discourse
community which uses the text. That is, if the text relates to, say,
architecture, then the task should ideally be devised by architects.
Without having to go through all the difficulties this is likely to
entail, Mohammed and Swales score highly in this respect, since,
as they point out, the appropriate task relating to instructions is
for the readers to carry out these instructions. Otherwise what we
are likely to get is either cloze tasks, which Bernhardt has characterised
as ‘a syntactic/productive measure of clausal knowledge’,
or ‘comprehension’ tasks devised by item writers with a training in
EFL or applied linguistics, which may be fine for describing what
a typical EFL teacher may extract from a text, but hardly suitable
to map out what a specialist may learn from it.
Language level
Many of the studies in this area refer to the language proficiency
level of the students. In Alderson and Urquhart’s studies, for
example, though a language factor was not built into the design
of the experiments, it was noted that, according to earlier proficiency
tests, liberal arts students were more proficient than
development and finance students, who were in turn superior to
the engineers. In Mohammed and Swales’ design, language (or
general reading proficiency) was integrated more closely into the
design, though the method of ascertaining it, by asking teachers
to estimate ELTS reading scores, was to say the least subjective.
Ridgway (1997) argues that the level of language proficiency is
crucial, and differences in level may have masked the background
knowledge effect in some cases. Certainly, Alderson and Urquhart’s
LA group equalled the engineers on engineering texts in two out
of three studies, presumably because of their higher language
proficiency. Ridgway, like Mohammed and Swales, argues for a
threshold linguistic level, below which any relevant background
knowledge cannot be brought into play, and this seems reasonable
(see below for threshold effects). We have less sympathy with
his claim that there is also an upper threshold level beyond which
the readers’ language ability is sufficient to allow them to read any
text with equal success. This, we feel, runs counter to our experience
of subject-specialised texts.
Schema theory
Finally, we should touch on the vexed issue of background knowledge
itself and what it consists of. Fairclough (1995) has criticised
some discourse analysis because of the assumption that background
presuppositions are ‘neutral’, or in accord with some kind
of objective reality. Instead, he points out that such presuppositions
may represent the views of ideological groups. While he
seems to us to make a good case for his position, the criticism
cannot be applied in much of the area we are discussing. This is
not because advocates of the role of background knowledge are
not vulnerable to criticisms of the type Fairclough raises. It is
rather that background knowledge is often not specified in sufficient
detail to enable the presuppositions to be examined.
Carrell (1983b) distinguishes between formal and content schemata,
i.e. knowledge about (a) the rhetorical structure of texts
and (b) the content. Both have been shown to have an effect at
times on reading performance. Mohammed and Swales (1984),
for example, are inclined to attribute their results to ‘familiarity
with the genre of technical instructions’ (p. 211), more or less
equivalent to Carrell’s formal schemata. We, however, prefer to
discuss this aspect of background knowledge under the heading
of ‘Literacy’ (below). Bernhardt, as we mentioned above, divides
studies into those concerned with cultural knowledge, subject-
specific content, and information supplied to readers shortly
before reading. In terms of the individual, we see no harm in
grouping the first two types of knowledge together. We suspect,
however, that information supplied to readers shortly before they
read a text is likely to play a different part in reading from that
played by longer established knowledge. We would like, then, to
consider this under teaching methodology, and concentrate on
well-established knowledge.
There are two related problems here: (a) to define in some way
what it means to say, for example, that someone ‘knows’ chemistry;
and (b) to test the person’s knowledge. Clearly the answer to
(a) must be more than just a collection of facts: it must include
relationships between ‘facts’, some idea of the purpose of the
pursuit, possibly of the history of the subject, and future applications.
Equally obviously it cannot be determined in terms of vocabulary.
As far as (b) is concerned, some form of test, such as the
free-association tests used by Langer (1984), might be considered.
However, while tests might successfully establish whether or not
a reader was already well informed about limited topics such
as cricket or baseball, it is difficult to imagine an easily administered
test of an adult’s knowledge of, say, production engineering.
Given this problem, one tends to fall back on the sometimes naive
assumption that if readers have already completed several years’
study of an academic subject, then they will possess a store of
background knowledge about it. This is the assumption made by
Alderson and Urquhart. Given the uncertainty created by this inability
to define the crucial variable, it is hardly surprising that some
experiments fail to come up with positive results.
In spite of all this, however, there is enough evidence in the
literature to support the theory that background knowledge plays
a crucial part in the reading process.
Studies of the effect of background knowledge, when they find
a positive effect, provide evidence that such knowledge can legitimately
be considered a component of reading. They tell us nothing
of the process, i.e. what is going on to produce this effect. For
any sort of answer to this, we are usually referred to schema theory.
This theory has been extensively described in the reading literature;
in fact it sometimes seems to be obligatory for anyone writing
a thesis on reading to begin with a lengthy description of the
theory, beginning with Kant (1781), moving to Bartlett (1932),
then to Rumelhart (1980). Bartlett found that English subjects,
given a North American folk tale, were unable to comprehend or
remember parts of it. This was in spite of the fact that none of the
words or sentences was linguistically unfamiliar or senseless. It
appeared that for comprehension and remembering to take place,
the linguistic input needed to match existing mental configurations
or concepts. Input which did not match the configuration
was not remembered, even though it presented no actual linguistic
difficulty.
While the notion of such configurations, or schemata, seems
very attractive, there are huge problems attached. Sadoski et al.
(1991: 466) quote Bartlett as saying:
I strongly dislike the term ‘schema.’ It is at once too definite and
too sketchy.
Below are some reasons for believing that schemata are not very
useful in reading research (or possibly, by the ease with which
they can be invoked in any number of situations, too useful):
1. Schemata are often described as being ‘structures’ or ‘templates’,
and are often seen as being hierarchical (e.g. Collins
and Quillian, 1969). Rumelhart (1980), on the other hand, sees
schemata as being fluid and constantly capable of adapting to
fresh information. Bartlett, also, in the excerpt referred to by
Sadoski et al., refers to the need to invoke ‘active, developing
patterns’. But a constantly changing template is not likely to be
a very useful instrument. In fact, the need for schemata to be
structured in advance, yet adaptable to text-driven alterations,
has been a problem for schema theorists from the beginning.
2. It has been argued that the term ‘schema’, as commonly used,
is virtually synonymous with ‘background knowledge’, and hence
is useless (cf. Sadoski et al., 1991).
3. Related to this is the odd fact that, at least in the L2 research
literature, while schemata are frequently appealed to, they are
seldom described in any detail. Compare the more rigorous
experimental investigations of prototype theory, particularly the
work on the cognitive representations of semantic categories
by psychologists such as Rosch (1975) and Rosch et al. (1976).
Thus L2 researchers invoke experimental subjects’ possession,
or lack of possession, of schemata related to weddings, Christmas,
etc., without ever giving a description of what is contained
in such schemata. Given that schemata are simultaneously described
as ‘structures’, this is very odd indeed. It is not always
the case that such description is missing. In the theoretical
literature we find some illuminating descriptions of hierarchical
structures, either of single vocabulary items, e.g. for the
item ‘canary’ in Collins and Quillian (1969), or for an event
such as a ‘ship christening’ in Anderson and Pearson (1988).
But such fairly detailed structures, while admirable and capable
of being tested, raise suspicions immediately. For example,
the ‘canary’ schema has, attached to the ‘bird’ node, the fact
that a bird ‘has wings’, ‘can fly’ and ‘has feathers’, but not that
it has a beak or builds nests. The ‘ship christening’ schema,
which is a very loose ‘structure’, and basically in fact is just a set
of unordered components, contains the information that the
christening takes place ‘in dry dock’. But how many readers
are likely to know this?
4. In addition to such lack of explicit description, L2 researchers
entertain remarkably loose notions of the whole concept, so
that schemata can be ‘activated’ or even ‘acquired’ at the drop,
so to speak, of a short passage of introductory reading. But if
the term is to have any use at all, then surely it must describe
mental constructs of some stability, developed over some time
by a sizeable portion of a population.
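The kind of hierarchical structure discussed in point 3 can be made concrete with a small sketch. The Python fragment below is an illustration only, not Collins and Quillian's own formalism: the names `NETWORK` and `has_property` are hypothetical, and only the ‘bird’ properties quoted above are encoded. It shows how a property is looked up by climbing from a node to its superordinates, and why anything not stored at some node (such as having a beak) is simply absent from the schema.

```python
# Illustrative sketch of a hierarchical schema. Node contents are limited to
# the properties quoted in the text; the data layout itself is an assumption.
NETWORK = {
    "bird": {"isa": None, "properties": {"has wings", "can fly", "has feathers"}},
    "canary": {"isa": "bird", "properties": set()},
}


def has_property(node, prop):
    """Climb the 'isa' chain and report whether prop is stored on any node."""
    while node is not None:
        entry = NETWORK[node]
        if prop in entry["properties"]:
            return True
        node = entry["isa"]
    return False


print(has_property("canary", "can fly"))     # inherited from the 'bird' node
print(has_property("canary", "has a beak"))  # stored nowhere in the hierarchy
```

A structure this explicit can at least be tested against readers' actual knowledge, which is more than can be said of schemata that are merely named.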
In the reading literature, different types of schemata have been
suggested. We have already referred to Carrell’s distinction between:
‘content schema(ta)’, relating to the content of a text read;
‘formal schemata’, relating to the rhetorical structure of the text;
and ‘cultural schemata’, more general aspects of cultural knowledge
shared by large sections of a cultural population. Carrell
(1988a) has also added ‘linguistic schemata’. Whether it is in reality
useful to apply the same term to notions as different as, say,
our knowledge of the passive voice, of behaviour at a wedding, of
birds, of the meaning and purpose of life, or of newspaper articles,
is questionable. Here we prefer to use the term ‘background
knowledge’ for content or cultural schemata. Formal schemata we
prefer to deal with under Bernhardt’s ‘literacy’ component, and
linguistic schemata under different areas of language. For a detailed
treatment of schemata, the reader is referred to the work of
Cavalcanti (1983).
Threshold levels
Mohammed and Swales found that one of their NNS/Science
subjects, whose estimated proficiency in English was low, was unable
to perform the tasks, although presumably the subject had the
necessary background knowledge. They put this down to the existence
of a threshold level, which in this instance they locate at
about Band 5 of the IELTS scale. In terms of an interactive model,
what this amounts to is a claim that there is a level below which a
deficit in one component cannot be compensated for by a corresponding
strength in another. The term was first used by Clarke
(1979) to account for aspects of data gathered from Spanish speakers
reading English. Clarke hypothesized that some of his subjects,
who were good readers in Spanish, were unable to transfer those
reading skills because of inadequate mastery of the L2. Recently
Ridgway (1997) has again invoked the threshold effect, this time,
like Mohammed and Swales, to explain why a group of Turkish
readers were unable to utilize background knowledge in reading
English. Thus the threshold level has been used to explain why
either background knowledge, or ‘reading skills’ (possibly equatable
with Bernhardt’s ‘literacy’ component) are unable to compensate
for a lack of linguistic proficiency (Bernhardt, 1991b).
The notion of a threshold level seems commonsensical: no
matter how good our reading skills are in the L1, or how expert
we are in the content area, we are not likely to make much of a
text in a language which is totally unknown to us. The mistake is
to imply or infer that there is a general linguistic threshold level,
valid for all tasks and all subjects. In fact, it seems obvious that
some tasks will require a higher threshold level than others. It is
probably also true that some subjects are able to make more of
their limited linguistic proficiency than others. Thus the threshold
level must be ‘reset’ for each subject or group of subjects, and
each set of tasks. Given this limitation, it is a constraint which
experimenters (and teachers) should keep in mind.
Literacy
Our final component is again taken from Bernhardt. By ‘literacy’,
she means operational knowledge: knowing how to approach text,
knowing why one approaches it and what to do with it. It includes
the reader’s preferred level of understanding, goal setting and
comprehension monitoring.
Under this heading, we include both ‘cohesion’ and ‘text structure’.
Both decisions may seem slightly controversial: Halliday and
Hasan (1976) would clearly class cohesion as a part of language
knowledge. And Alderson (1993) is not unusual in deciding to
include cohesive items in a test of grammar. De Beaugrande
(1980), however, criticises an exclusive focus on the linguistic elements
because of the lack of consideration paid to ‘the underlying
connectivity of text-knowledge and world-knowledge that makes
these (cohesive) devices possible and useful’ (p. 132). The relationship
between cohesive elements and text knowledge seems a
good argument for including cohesion under ‘Literacy’.4
As far as text structure is concerned, Carrell’s labelling of knowledge
of such structures as ‘formal schemata’ might suggest that
this topic should be included under the general heading of ‘background
knowledge’. However, to return to Bernhardt’s formulation
of literacy, ‘knowing how to approach a text’ must surely
include knowledge of what kind of text it is, and hence how it is
likely to be structured.
It would also be in line with Bernhardt’s description to place in
this section an account of readers’ strategies. However, for various
reasons, we have chosen to discuss these in Section 2.3. We should
like the reader to note, however, that strategies clearly, in our
opinion, form part of the literacy component.
Decisions as to where to place various elements are not just
part of authorial housekeeping. We argue below that some of the
discussion of reading skills versus language skills has been vitiated
by vagueness as to what ‘reading skills’ actually consist of. It seems
to us that Bernhardt’s ‘literacy’ component is the best place to
look for distinctively reading skills. Hence, what we decide to include
in this component becomes crucial in an interesting and
important area of research.
Cohesion
For de Beaugrande (1980), ‘cohesion subsumes procedures whereby
surface elements appear as progressive occurrences such that
their sequential connectivity is maintained and made recoverable’
(p. 19). In the 1970s and early 1980s, there was considerable
interest in the effect of cohesion on L2 reading, and many
books designed for classroom use, such as the Focus series (e.g.
Glendinning, 1974), contained exercises designed to train readers
in responding to cohesive devices in texts. Teachers, in our experience,
have not always been convinced of the usefulness of such
exercises. There is, moreover, comparatively little good research
in this area. The topic, in fact, is rather more difficult and obscure
than is sometimes recognised.
There is a hint in de Beaugrande’s definition that ‘cohesion’ is
a cover term, and this introduces a problem for the researcher.
Different cohesive procedures may have radically different functions.
The most obvious difference is that between Conjunction,5
whereby a cohesive device indicates the pragmatic relationship
between two text utterances or blocks, and devices such as Reference,
Substitution, and Ellipsis, where the cohesive item replaces
previously occurring parts of the text. The skills employed in handling
these two groups are likely to be very different, raising the
question as to whether it is desirable to investigate the effect of
‘cohesion’ seen as a homogeneous entity. The actual function
played in texts by the different elements is not obvious, and it
might be useful to distinguish between writer functions and reader
functions. It might seem that the function of conjunction is not
difficult to account for: by making the relationships between text
units more transparent, the presence of conjunctive items might
be expected to make texts more transparent for the reader, and
hence easier to read. It is true that both Meyer (1975) and
Urquhart (1976) found that, in the case of native speakers, marking
the relationships did not seem to affect recall of a text. On the
other hand, Cohen et al. (1979) found that with a reasonably
extended text, native speakers of English structured their understanding
in part by depending on conjunctives, whereas the non-native
readers failed to appreciate the relationships signalled by
the conjunctives.
As various writers have pointed out, however, a sequence of
text units may be coherent without the conjunctive item formally
signalling the relationship. Thus the importance of conjunction
on any particular occasion is open to question, varying as it is
likely to do between readers. Steffensen (1988) argued that if
cohesion was weakly related to coherence, recall of ‘native’ texts,
i.e. relating to the culture of the reader, being more coherent,
should contain more cohesive items than recall of corresponding
‘foreign’ texts. The hypothesis was not confirmed and Steffensen
concludes that the formal teaching of cohesive devices in L2 reading
should be treated with caution. We would argue, however,
that her data suggest that the use of conjunction here is writer-
focused, the writers using conjunctions to try to make sense of the
text they are producing.
Urquhart (1976) found that academically gifted L1 teenagers
introduced conjunctives when recalling texts which had not originally
contained them. This, and Steffensen’s results, then, serve
to remind us that cohesion may be at least as important for the
writing class.
The textual function of the other main group of cohesive
devices, Pronominal Reference, Ellipsis and Substitution, and its
relationship to reading, is perhaps even more problematic. Different
writers have suggested different functions: continuity (Halliday
and Hasan, 1976); economy (de Beaugrande, 1980); foregrounding
(Chafe, 1972). We find it easier to describe such functions from
the writer’s point of view: ‘Use pronouns to be economical and
avoid repetition’, etc. From the point of view of the reader, the
effect of such cohesion is more difficult to define. With respect to
economy, de Beaugrande refers to the trade-off between compactness
and rapid access, i.e. pronominal reference is compact but
ambiguities in reference may confuse and delay access. The final
effect on readers may depend on individual skills, language proficiency,
and knowledge of the world. The effect of foregrounding
might seem to be even more difficult to assess.
In some cases, the effect of cohesive items may be very much
on the surface. Cohen et al. (1979) report that in some cases their
subjects simply did not know the meaning of conjunctives such as
‘thus’. Some similar lack of surface familiarity may have been
responsible for the situation, reported by Berman (1984), that
L2 readers preferred texts in which pronominals and substitution
items had been replaced by their lexical equivalents. The
readers may just not have been familiar with the use of the cohesive
items.
As far as the question of how readers handle cohesion is concerned,
psychological research in the L1 area has concentrated
on the relative difficulty of identifying antecedents. One hypothesis
is that the distance between antecedent and pronoun will
cause processing difficulty (this hypothesis is implicit in Halliday
and Hasan’s description, with its taking account of ‘distance’ in
terms of number of sentences, and ‘mediated ties’ (sequences
of pronouns all with the same antecedent)). Rayner and Pollatsek
(1989: 273), reviewing the evidence, conclude that ‘pronoun
reference ... is governed not only by linguistic rules but by a looser
set of discourse guidelines . . . based on the type of verb, parallelism
of form, and whether the noun is still the topic of the discourse’.
Kintsch and van Dijk’s model of reading (see below) relies
heavily on repetition to establish overlap and hence coherence
between propositions (Kintsch and van Dijk, 1978). Depending
on how one defined it, repetition could include, in Halliday and
Hasan’s terms, ‘lexical reiteration’, ‘collocation’, major aspects of grammatical
reference and substitution. We are a little cautious about such
a seemingly crude approach to coherence, but there seems little
doubt that repetition must be involved in readers’ perception that
the writer is continuing to talk about ‘the same thing’.
It would thus seem very likely that cohesive procedures on the
part of the reader have effects on reading performance. In both
L1 and L2, however, the investigation of these effects would seem
to require more subtlety than has been evident up to now. The
importance of the teaching of cohesive procedures in the writing
class should perhaps be emphasised.
Text structure
Brown and Yule (1983) point out that some of the coherence of a
text derives not so much from the presence or absence of surface
cohesive features such as conjunctives, but from underlying text
relationships to which the conjunctives are pointers. From the
1970s onwards, several models have been available which attempt
to map underlying coherence in text. Meyer (1975) points to a
distinction between models which take into account the author’s
organization of the text being analysed, and those which impose
another type of organization. Meyer has in mind analysts like
Crothers (1972) who impose a form of logical structure on texts.
Davies also ignores the author’s organization, basing the analysis
on types of information found in texts, so that, for example, ‘physical
structure’ texts contain information about part, location, property
and function (Davies and Greene, 1980; Davies, 1983; Johns
and Davies, 1983). In the discussion below, we shall concentrate
on models of the first type. We have selected descriptions which
are (a) similar enough to each other to make for, hopefully, a
coherent discussion and (b) dissimilar enough to make comparison
interesting. Of the three selected, the models used by Meyer
(1975) and Kintsch and van Dijk (1978) have been used in reading
experiments; the ‘composite’ model formed by the work of
Hoey (1983) and Winter (1994) has been offered as ‘pure’ text
analysis, independent of reading.
Our models typically consist of (a) some form of ‘unit’ out of
which the larger structure is constructed, (b) a set of relationships
between such units and (c) a larger, global structure, to which the
more local structures are in some way related.
Thus in the work of Winter (e.g. Winter 1994) and Hoey (1983)
the units are natural language clauses, and the local relationships
include Generalisation/Exemplification, and Denial/Correction.
In addition to these ‘clause relations’, Winter and Hoey refer to
basic text structures, such as Situation/Problem/Solution. As far as
the relations between clause and text structures are concerned,
Winter is not very explicit. Hoey (1983: 57) provides suggestions
for mapping clause relations on text structure, along the lines of:
If a Cause/Consequence relation consists of a and b, and a is
identified as a Problem, then if b contains the role of agent, b is
Response.
We don’t know whether such mapping rules have ever been systematically
tried on extended texts.
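To show what trying such a rule on a text would involve, here is a minimal sketch in Python of the single mapping rule quoted above. It is not Hoey's own procedure: the `Clause` class, the `apply_response_rule` function and the two example clauses are all hypothetical, and the sketch assumes that the Cause/Consequence relation and the presence of an agent have already been identified by the analyst.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    text: str
    role: Optional[str] = None   # text-structure role, e.g. "Problem"
    has_agent: bool = False      # analyst's judgement, not computed here


def apply_response_rule(a, b):
    """Given a Cause/Consequence pair (a, b): if a is a Problem and b
    contains the role of agent, label b as Response."""
    if a.role == "Problem" and b.has_agent:
        b.role = "Response"
    return b


# Invented example pair, for illustration only.
problem = Clause("The harbour silted up", role="Problem")
consequence = Clause("so the council dredged the channel", has_agent=True)
apply_response_rule(problem, consequence)
print(consequence.role)
```

Even in this toy form, the rule leaves the difficult judgements (what counts as a Problem, whether an agent is present) to the analyst, which is one reason systematic application to extended texts would be laborious.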
Meyer’s (1975) model is taken from the linguist Grimes (1975).
The basic unit is the proposition, consisting of a predicate and one
or more arguments. There are two kinds of proposition: lexical and
rhetorical. In the first, arguments are related to their predicates by
semantic roles such as agent, patient, range, etc. (cf. Fillmore,
1968). The second kind are rhetorical propositions.
Their main function could be thought of as that of organizing the
content of discourse. They join lexical propositions together, and
they join other rhetorical propositions together.
(Grimes, 1975: 207)
One might reach the conclusion that lexical propositions operated
up to clause level, while rhetorical propositions took over to
link clauses or sentences. This is not strictly true, since rhetorical
predicates can occur within clauses: ‘the rhetorical predicates attribution,
specific, collection and equivalence are frequently found
in simple sentences’ (Meyer, 1975: 45). They do, however, tend to
be superordinate in text structure. According to Grimes (p. 207):
In a tree that represents the underlying structure of a discourse
. . . most of the propositions near the root are likely to be rhetorical,
while most of the propositions near the leaves are likely to be lexical.
Rhetorical propositions are divided into three types: paratactic,
hypotactic and neutral. In paratactic propositions, both arguments
are at the same ‘level’; a typical paratactic predicate is collection, of
the sort ‘[there is] A and B and C’, in which A, B and C are equal
in rhetorical level. In a hypotactic proposition, one argument is
superordinate to another. Thus in the Evidence predicate, the
evidence argument is subordinate to the argument for which it
supplies evidence. Neutral predicates can be either paratactic or
hypotactic.
Since hypotactic predicates have the effect of subordinating
one argument to another, and, more generally, since an argument
can consist of a proposition which can include arguments
which . . . etc., the result of the analysis of a text is a ‘hierarchically
arranged tree structure’ called a ‘content structure’.
There are some quite close resemblances between Meyer’s analysis
and that of Winter and Hoey. Relationships such as ‘Cause/consequence’
and ‘General/specific’ occur in both, though with different
terminology. ‘Problem/solution’ again is present in both
analyses. However, in Meyer, it is a rhetorical predicate, capable
of appearing at different levels of the content structure; in Hoey it
is a basic text structure, seemingly different from clause relations.
In general, in the 1975 account, Meyer does not distinguish
between local and global relations. Meyer and Rice (1984: 326)
refer to three levels, with micropropositions at sentence level,
macropropositions at paragraph level, where ‘the concern is with
the relationship among ideas represented in complexes of ideas
or paragraphs’, and the third level is ‘the overall organising principle’
of the text, e.g. causality, problem/solution, etc. (p. 327).
Before beginning our account of our third model, that of Kintsch
and van Dijk (1978), we should point out that it is much more
than a description of text organization. In fact, it sets out to model
a large part of the process of reading and remembering an extended
text. As such, it contains a considerable amount of discussion
about how the reader proceeds to take in a certain amount
of information at a time, and operate on this information while it
is in working memory before proceeding to the next chunk of
information. It also deals, in rather less detail, with how the reader,
simultaneous with processing the clauses and sentences of the
text into a coherent whole, the text base, also proceeds to build
up an account of the gist of the text, the macrostructure, or rather,
given the constraints of memory and the cyclical nature of the
processing model, a sequence of macrostructures. To the extent
that Meyer also tested her model empirically, this constitutes a
similarity between her and Kintsch and van Dijk, and a major
difference between these writers, on the one hand, and Winter
and Hoey on the other.
As with Meyer, the basic unit of analysis for Kintsch and van
Dijk is the proposition, consisting of a predicate and arguments.
Predicates ‘may be realized in the surface structure as verbs, adjectives,
adverbs, and sentence connectives’ (Kintsch and van Dijk,
1978: 367). Thus the distinction drawn by Grimes and Meyer between
lexical and rhetorical predicates is not observed. Propositions
are expressed in the form (BETWEEN, ENCOUNTER, POLICE,
BLACK PANTHER), which represents the text ‘encounters between
police and Black Panther Party members’ (p. 377). The system
incorporates a set of semantic role relationships similar to those
described by Grimes, but in the 1978 article, these are not indicated,
‘to keep the notation simple’, and in fact, Kintsch and van
Dijk’s trees are much easier to read than Meyer’s.6
These micropropositions are built up into a structure referred to
as a text base, or microstructure, which can be depicted as a coherence
graph. Such graphs are very similar to Meyer’s content structure.
Coherence is maintained largely by referential coherence, depicted
in the graph as overlap of arguments. Thus the two propositions
‘ABC’ and ‘XDC’ achieve coherence by the presence in both of
the argument ‘C’. Sometimes coherence cannot be detected in
this way, and then the reader may need to generate inferences
to maintain coherence. Thus the total list of propositions may
be longer than that contained in the text. Propositions are also
arranged in terms of ‘level’, i.e. some propositions are superordinate
to others. This ordering is done partly in terms of the simplicity
of the resulting graph structure, and partly in terms of coherence
relations. If, for example, proposition 17 has been nominated as
superordinate, and propositions 15, 10 and 13 relate cohesively to
17, then they are all considered as subordinate to 17. If, then,
propositions 20 and 23 relate cohesively to 13, they are in turn
subordinate to it.
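The levelling procedure just described can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Kintsch and van Dijk's full model: it ignores the working-memory cycles, the inference step and the simplicity criterion, and treats coherence purely as shared arguments. The function name `assign_levels` and the argument labels P to U are hypothetical; only the proposition numbers follow the example in the text.

```python
from collections import deque


def assign_levels(propositions, top):
    """Breadth-first walk from the nominated superordinate proposition.

    propositions: dict mapping a proposition id to the set of its arguments.
    Returns a dict mapping each reachable id to its level (top = 1).
    Two propositions are taken to cohere if they share an argument.
    """
    levels = {top: 1}
    queue = deque([top])
    while queue:
        current = queue.popleft()
        for pid, args in propositions.items():
            if pid not in levels and propositions[current] & args:
                levels[pid] = levels[current] + 1
                queue.append(pid)
    return levels


# Argument overlap as in the 'ABC' / 'XDC' example: 'C' is shared.
print({"A", "B", "C"} & {"X", "D", "C"})

# Invented arguments arranged to mirror the example with propositions 17, 15, 10, 13, 20, 23.
propositions = {
    17: {"P"},
    15: {"P", "Q"},
    10: {"P", "R"},
    13: {"P", "S"},
    20: {"S", "T"},
    23: {"S", "U"},
}
levels = assign_levels(propositions, top=17)
print(levels)
```

A proposition that shares no argument with any other would receive no level at all in this sketch; in the model, that is the point at which the reader would have to generate an inference.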
The other main organizational component in Kintsch and van
Dijk’s model is the macrostructure. If the microstructure is ‘the
local level of the discourse, that is, the structure of the individual
propositions and their relations’, the macrostructure is ‘of a more
global nature, characterising the discourse as a whole’ (p. 365).
Thus Kintsch and van Dijk, like Winter and Hoey, have two separate,
though connected levels of organization. Kintsch and van
Dijk see the macrostructure as being built up at the same time as
the microstructure; in other words, the former is not a summary of
the latter. The macrostructure is formed partly by the application
of macrorules, which operate on the microstructure, for example
deleting irrelevant propositions, or substituting generalisations for
sequences of detailed propositions. The propositions so derived
are organised by a schema, which is brought by the reader into
contact with the microstructure.
On the whole, Kintsch and van Dijk deal with schemata as ‘conventional
schematic structures of discourse’ (p. 366), equivalent
to Carrell’s ‘formal schemata’. However, they do make allowances
for a different form of schema.
The reader’s goals in reading control the application of the macro-operators.
The formal representation of these goals is the schema.
(p. 373)
In other words, the schema is produced in accordance with the
reader’s goals in reading. Given this view of schema, Kintsch and
van Dijk (1978) envisage three situations. In the first, ‘a reader’s
goals are vague, and the text that he or she reads lacks a conventional
structure’ (p. 373). In this case, the schema invoked and
the macrostructure would be unpredictable. This is the form of
reading which we shall later be discussing under the term ‘browsing’
(p. 103). In the second situation, the text type is highly conventional,
and this in turn sets clear goals. In the third situation,
the goals are again clear, but are set by the reader who has
a special purpose in mind, which may override the text structure.
Such a reader is discussed below as a ‘dominant’ reader.
Empirical validation of the text structure models
Various aspects of the models above have received some measure
of experimental validation. Kintsch and Keenan (1973) found that
sentences became more difficult to read and understand in relation
to the number of propositions they contained: the more propositions,
the more difficulty (see also Weaver and Kintsch, 1991).
Meyer (1975) used her model to investigate the type of information
recalled by subjects after reading an extended prose text.
Information which the model showed to be at a higher level in
the text was recalled better than lower level information. At the
overall organization level, Stanley (1984) found that both natives
and non-natives preferred summaries organised according to a Problem/Solution
model, as opposed to texts of the same length and
linguistic difficulty which deviated from the model. Carrell’s findings
(Carrell, 1984) that familiar rhetorical organisation appeared to
help readers, can be added here. Rayner and Pollatsek (1989) again
cite evidence that readers’ recall of the gist of texts tended to
resemble the macrostructure.
Conclusion
Clearly an organized text is more than a string of clauses or sentences
or propositions. Equally clearly, this fact is likely to be
relevant to the reading process. All the descriptions of text structure
above seem to us to have something to contribute in this area
and the most useful will be discussed further in Part 3 in relation
to choice of texts for testing. However, before we leave the topic,
some criticisms remain to be dealt with.
The use of propositional analysis by Kintsch (1974) and van
Dijk (1977) has been criticised by Brown and Yule (1983) on the
grounds that, in spite of its appearance of formalism and objectivity,
it is fundamentally subjective. While this is probably true, it
does not seem to us to be exceptional. Most interesting analysis of
natural language rests on consensus; the most formalized generative
grammar is supposed to be constructed on the basis of native
speakers agreeing on whether such and such a sentence is acceptable
or not. The fact that they are seldom consulted is beside the
point. It is fairly easy to check the extent to which two or more
analysts agree.
Brown and Yule also criticize such analyses for concentrating
on content, and for being unable to deal with staging. This again
is probably true; Meyer (1975) comments on the inability of
Grimes’ analysis to incorporate staging successfully. From our point
of view, given that our main focus is on students’ learning from
texts, it is possible that staging is of minor importance in compar
ison with conventional text structure. This would not be sur
prising: we have already noted how another linguistic aspect of
writing, namely signalling, seems to have comparatively little effect
on the reader.
Rayner and Pollatsek query whether the elaborate apparatus
used by Kintsch and van Dijk is justifiable, or whether it could be
replaced by something less formal, available to the ‘intelligent lay
reader’. This certainly seems to be a valid criticism of an analysis
such as Meyer’s, which is difficult to read, made scoring of sub
jects’ recall scripts, on her own admission, an ‘extremely tedious’
task (p. 101), and which is made complex by the inclusion of case
roles which could not be correlated with any effects in the recalls.
The complex formality is particularly dubious, given the compar
atively crude description of how to arrive at the analysis, e.g.
The topic sentence of the third paragraph of this passage states
that the breeder reactor is the solution to the previously stated
problems. (p. 54)
In Section 5.3 (pp. 275-7), we put forward less formal ways of
analysing texts for our particular purposes.
Finally, we should like to make a comment on specific patterns
of organization in texts, referred to and to some extent critical for
all the analyses above. The question is: How finite are these differ
ent patterns of organization? We have seen that Kintsch and van
Dijk’s macrostructures seem in part to depend on what they refer
to as ‘highly conventionalized text types’. Such text types unques
tionably occur. It is our experience, however, based on teaching
the analysis of written texts, either that the number of such text
types is very large indeed, or that a number of the texts encoun
tered are indeterminate as to overall text structure. Hoey (1983:
34) argues that ‘the number of discourse patterns that can be
built out of a finite set of relations signalled in a finite number of
ways is indefinitely large’. Whether the first part of this claim is
true (Hoey appears to be rather coy about listing interclausal
relations), we tend to agree with the second part.
Text types
Text types have been referred to above in relation to Kintsch and
van Dijk’s model. With their reference to ‘highly conventionalised
text types’, it seems that these authors have in mind something
like Swales’ genres (Swales, 1990). Just because they are so spe
cific, Swales’ genres seem to us more useful in accounts of writing.
In our discussion here, we concentrate on an older tradition of
describing rather generalised types which, in our opinion, are of
more general relevance for reading.
De Beaugrande (1981: 307) asserts that ‘. . . reading models
will have to find control points in the reading process where text-
type priorities can be inserted and respected’. The implication is
that the different textual and communicative demands of differ
ent text types will affect reading performance, and further that
some readers may be limited with regard to the types they can
handle.
De Beaugrande mentions Narrative, Descriptive and Argumentative
types. Calfee and Curley (1984) have Object, Sequence and Idea,
in which Sequence relates to Narrative, Idea to Argument, and Object
is loosely related to Description. Moore (1980) adds Exposition and
Enquiry, and Brooks and Warren (1952) have Exposition, Narrative,
Description and Argument as the four ‘basic types of writing’.
An examination of these writers suggests that there are at least
four criteria involved in defining text types.
■Communicative intent. Thus Brooks and Warren define Argument
as ‘the kind of discourse used to make the audience . . . think
or act as the arguer desires’.
■Content. Calfee and Curley’s Object category is defined as ‘dis
cussions of things, persons and even ideas’.
■Structure. Calfee and Curley define Sequence as ‘to do with any
account in which progression is the key to the structure’.
■Status of the information. Moore argues that Exposition ‘presents
knowledge already established’, while Enquiry is concerned ‘to
raise questions, . . . and express doubts and possibilities’.
There is an interesting attempt at a taxonomy using communicat
ive intent, status of information, and expected response from reader
in Baten and Cornu (1984). A development of their taxonomy
relating to Expository texts is presented in Urquhart (1996).
There appears to be little work done on the possible differ
ential effects of text types on readers. However, individuals often
report that they prefer fiction (imaginary narrative) or are poor at
reading instructions. McCormick (1992) hypothesized that nar
ratives should be easier than expository texts. This was not con
firmed but she considers that background knowledge is more
important in the case of expository texts. Reading tests, such as
that contained in IELTS, use a fairly informal categorisation of
text types. A combination of different text types with tasks suitably
tailored for particular types (Narrative would seem to invoke a
different set of responses from Exposition) has been put forward
by Urquhart (1996) as a potentially rewarding area of research
(cf. Kobayashi, 1995). We shall revisit the issue of text types in
Chapter 3 when it will be considered as a performance condition
in testing activities.
Having examined in some detail the components of the various
models we need to explore how they might be operationalised by
readers interacting with text(s) for different purposes. As well as a
concern with the nature of text readers may engage with, we need
to look in more detail at the process and product involved in and
resulting from such interactions.
Comprehension
As noted in Chapter 1 (Preliminaries), the focus of attention among
people concerned with reading in education was ‘decoding’,
whereas in the 1970s the focus moved to ‘comprehension’. This
can be seen clearly in Figure 1.2 above, based on ERIC data,
which shows comprehension studies taking off from 1966 onwards.
The switch of attention from decoding to comprehension must
have been very liberating for teachers and researchers, since,
although we have stressed the importance of decoding, there seems
little doubt that a focus on the information being communicated
by texts has more potential for interest. A focus on comprehen
sion is in line with our feeling that this is what reading is ‘about’,
i.e. getting information from written texts. And there is no doubt
that our monitoring of our own reading comprehension is of
major importance. A judgement that we have not understood a
text may well leave us unsatisfied, or lead us to re-read it, or
perhaps reject it in disgust.
In spite of this, however, comprehension in some areas re
mains a somewhat elusive entity. Rayner and Pollatsek (1989), for
example, give neither definition nor description of comprehen
sion itself, though, according to their index, the larger part of the
chapter dealing with ‘Representation of Discourse’ is concerned
with ‘comprehension processes’. From the first part of their chap
ter, one might gather that, for them, comprehension equals ‘ “the
meaning of the text” that is being read’ (p. 264).
It is, in fact, our contention that in the teaching and testing of
reading, ‘comprehension’, as generally defined, has been either not
very helpful or positively dangerous. Urquhart (1987) summarises
common assumptions behind the pedagogical view of compre
hension as follows:
Assumption 1. There is such a thing as ‘total’ or ‘perfect’ comprehension
of a text.
Assumption 2. Careful reading, which aims to extract perfect comprehension,
is superior to any other kind of reading, e.g. skimming, and is, in fact,
the only kind of reading which deserves the name.
Evidence of the existence of these assumptions is pervasive. As
evidence for Assumption 1 we may quote Fry (1963) as saying that
100 per cent on his comprehension exercises equals ‘perfect com
prehension’; Sticht (1984) argues that claims for the possibility of
reading much faster than listening rest on a confusion between
skimming and scanning on the one hand and reading on the
other. Hence skimming and scanning, which can accept lower
levels of comprehension, are not really ‘reading’ at all.
Whether or not ‘perfect comprehension’ is a feasible goal, we
should reject the assertion by Fry and others that it is equivalent
to a 100 per cent score on comprehension questions. Even with a
short text, it is usually possible to devise a large number of questions.
The conventional ten questions, often multiple-choice, which
pass as a comprehension test, represent at best a sampling of
information gained by reading. As Lunzer et al. (1979: 66) put it,
How a student completes a test is an INDEX of his capacity to
comprehend; it is not the capacity itself and still less is it the com
prehension itself.
We have said that the typical pedagogical view is ‘dangerous’.
Firstly, by largely insisting on the superiority of one type of read
ing at the expense of all others, it has the effect of disparaging
perfectly normal types of reading behaviour. This does not just
apply to reading types such as skimming and scanning (see Sec
tion 2.4 below), where the view allows a drop in comprehension
in return for an increase in speed. By giving preference to what
we call ‘careful’ reading - i.e. the type normally associated with
study - it also effectively downgrades the value of the type of
reading behaviour many people will adopt when reading, say, de
tective novels for enjoyment, where the reader’s monitor is likely
to accept lower standards of comprehension. Classroom reading
becomes almost exclusively ‘intensive’ reading (see Chapter 4),
and if classroom tasks have any influence on students’ behaviour
outside the classroom, this may well result in slow, laborious read
ing when this is not, in fact, necessary.
The other danger lies in the assumption that a text contains
a finite amount of information, accessible to all readers. The information
is, in other words, ‘on the page’. This clashes with the
currently widely accepted view that the reader interacts with the
text in order to obtain a message, e.g.
Thus, contrary to conventional wisdom, which states that comprehen
sion is the process of getting meaning from a page, comprehension
is . . . the process of bringing meaning to a text.
(Samuels and Kamil, 1988: 206)
This view has serious consequences both for the teaching and
testing of reading. If each reader brings meaning to a text, then
each comprehension is likely to be different. The notion of a
‘right’ answer has now to be treated with care. Variations in com
prehension are likely to come from different background know
ledge brought to the text (though this is not the only possible
source). In a classroom where teacher and students share the
same culture, such variations may not be very large. In EFL or ESP
classrooms in the English-speaking world, however, where teacher
and students may come from a wide range of backgrounds and
cultures, the possibility of varying comprehensions may become a
major problem. And if this is true for the classroom, it is even
more true for international EFL reading tests.
Urquhart (1987) distinguishes between ‘comprehensions’, re
ferring to differences brought about by readers setting themselves
different levels of acceptable comprehension (i.e. between reading
a book for an examination and reading it for light amusement),
and ‘interpretations’, referring to differences resulting either from
different readers bringing different information to a text, or the
same reader at different times, bringing a different mind-set. While
the terms may not be perfectly chosen for keeping the different
factors apart, the distinction should serve as a reminder of the
number of variables likely to be present in many teaching or test
ing situations (discussed in Chapters 3 and 4 below).
We should mention here the notion that the ‘ideal’ comprehension
consists of the recovery of ‘author’s meaning’. We do not
think that it can be doubted that readers often strive to do this; it
is an important aspect of careful reading, and, since it involves
close attention to textual features such as use of conjuncts (‘however’,
etc.), headings, the ordering of information, and so on, it is
something that can partly be taught. We have only two doubts
about it being used as the ‘ideal’ comprehension. First, it can
never be fully achieved. We can never be sure that we have totally
entered the writer’s mind. It could be said, however, that it is in
the nature of all good ideals never to be achieved. Secondly, a
careful attempt to recover author’s meaning is not characteristic
of all reading; the reader engaged in scanning, for example, may
pay little attention to author’s intentions. Such attention is, in
fact, characteristic of careful reading, particularly where this is
submissive (see below). As such, it is important, but cannot be a
definition of comprehension in general.
We have just argued that ‘author’s meaning’ can never be recap
tured in its entirety. It can be argued, however, that just recapturing
author’s meaning is not enough. Advocates of ‘critical reading’
(cf. Fairclough, 1995; Wallace, 1992a&b) point out that texts are
dependent on presuppositions stemming from their authors’ own
particular world view, their ‘ideology’. It then becomes the duty of
the critical reader, by spotting such ideological presuppositions,
to evaluate a text in its cultural context.
It is clear that comprehension cannot be viewed simply as the
product of any reading activity. Rather, in any reading situation,
comprehension will vary according to the reader’s background
knowledge, goals, interaction with the writer, etc. Comprehension is
a useful term to contrast with decoding; otherwise it is best perhaps
taken as the product resulting from a particular reading task, and
evaluated as such.
Skills
A reading skill can be described roughly as a cognitive ability
which a person is able to use when interacting with written texts.
Thus, unlike comprehension, which can be viewed as the product
of reading a particular text, skills are seen as part of the general
ized reading process.
Skills have been a major area of reading research over recent
years, as can be seen in Figure 2.1, based on data from ERIC.
Figure 2.1 Number of articles and other publications published between 1966 and 1996 that mention ‘skills’ and ‘strategies’ in their title or in ERIC’s index or abstract (based on data from ERIC). [Line graph: horizontal axis, years 1966 to 1996; vertical axis, number of publications, 0 to 3500; two series, Skills and Strategies.]
Skills have been recommended by Lunzer et al. (1979) and
Vincent (1985) as a means of structuring reading syllabi, and are
probably still the best framework for doing this. They have also
been used for test construction, notably in the ELTS test and
TEAP (Weir 1983a, 1990). Useful as the concept of skills has been,
there are considerable problems attached to it. Williams and Moran
(1989: 223) point out that while a number of skills taxonomies
exist, there is little consensus concerning the content of the
taxonomies or in the terminology used to describe them. Below we
give a selection of fairly typical taxonomies, which serve, inciden
tally, to justify Williams and Moran’s comment.
1. Davis (1968):
■Identifying word meanings.
■Drawing inferences.
■Identifying writer’s technique and recognising the mood of
the passage.
■Finding answers to questions.
2. Lunzer et al. (1979):
■Word meaning.
■Words in context.
■Literal comprehension.
■Drawing inferences from single strings.
■Drawing inferences from multiple strings.
■Interpretation of metaphor.
■Finding salient or main ideas.
■Forming judgements.
3. Munby (1978):
■Recognising the script of a language.
■Deducing the meaning and use of unfamiliar lexical items.
■Understanding explicitly stated information.
■Understanding information when not explicitly stated.
■Understanding conceptual meaning.
■Understanding the communicative value of sentences.
■Understanding the relations within the sentence.
■Understanding relations between parts of text through lex
ical cohesion devices.
■Interpreting text by going outside it.
■Recognising indicators in discourse.
■Identifying the main point of information in discourse.
■Distinguishing the main idea from supporting detail.
■Extracting salient points to summarise (the text, an idea)
■Selective extraction of relevant points from a text.
■Basic reference skills.
■Skimming.
■Scanning to locate specifically required information.
■Transcoding information to diagrammatic display.
4. Grabe (1991: 377):
■Automatic recognition skills.
■Vocabulary and structural knowledge.
■Formal discourse structure knowledge.
■Content/world background knowledge.
■Synthesis and evaluation skills/strategies.
■Metacognitive knowledge and skills monitoring.
It is comparatively easy to criticise some of these taxonomies,
even at first sight. Davis’s ‘Finding the answers to questions’ seems
to include all the others, and prompts the query: ‘Which ques
tions?’ It is hard to believe that the assignment of separate status
to ‘Drawing inferences from single strings’, and ‘Drawing infer
ences from multiple strings’ in Lunzer et al.’s taxonomy is really
justified. There are, however, wider questions to ask about the
taxonomies.
How inclusive is a skill? Clearly, in the taxonomies above, some
skills seem more inclusive than others; Grabe’s taxonomy, for
example, uses very general categories, virtually equivalent to know
ledge areas. Rayner and Pollatsek (1989) begin their preface,
‘Reading is a highly complex skill. . .’ (p. ix). Clearly, if reading
itself is a skill, it must be possible to break this down into different
levels of component skills categories. Williams and Moran (1989)
suggest a rough distinction between ‘language related’ skills and
‘reason related’ skills.
Various attempts have been made to arrange skills into hier
archies. Of the taxonomies above, that of Lunzer et al. is so
arranged, with the ‘lowest level’ skill at the top. Munby’s tax
onomy was not intended to be hierarchically arranged, though in
his review of the work Mead (1982) argued that it should have
been, on the grounds that some skills seem to presuppose the
learning of other skills. Some possible criteria for ranking skills
are as follows:
(a) Logical implication. One component in the system can logically
be considered to presuppose all components below it. This is
the criterion used by Bloom et al. (1956, 1974).
(b) Pragmatic implication. A reader displaying one skill in the sys
tem can be assumed to possess all the ‘lower’ skills.
(c) Difficulty. The components are arranged in order of increas
ing difficulty.
(d) Developmental. Some skills are acquired earlier than others.
Some syllabi, rather unwisely, in our view, assume that readers
pass through a period of comprehending ‘explicitly stated’
information before they arrive at the stage of inferencing.
(e) Discourse level. A skill is ordered with respect to the size or
level of the discourse unit it relates to. We have not found
explicit mention of this criterion in the literature, but suspect
it is commonly used by teachers and applied linguists. It would
explain a tendency to rate ‘thematic’ questions, aimed at the
whole of a text, as being ‘high level’.
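Criterion (b) lends itself to a simple check, sketched below. This is an illustration only, under stated assumptions, and is not the procedure used in any of the studies cited: it supposes that each reader has a pass/fail result on every skill in a hierarchy ordered from lowest to highest, and it lists the ‘lower’ skills a reader lacks despite displaying a higher one. The function name and the sample profiles are hypothetical.

```python
def implication_violations(profile):
    """Return indices of 'lower' skills that are missing although a higher one is shown.

    profile: list of booleans, one per skill, ordered from lowest to highest.
    """
    shown = [i for i, displayed in enumerate(profile) if displayed]
    if not shown:
        # No skill displayed, so nothing is implied and nothing can be violated.
        return []
    highest = max(shown)
    return [i for i in range(highest) if not profile[i]]


# Hypothetical readers tested on four skills, lowest skill first.
readers = {
    "reader_a": [True, True, False, False],  # consistent with the hierarchy
    "reader_b": [True, False, True, False],  # shows skill 2 without skill 1
}
violations = {name: implication_violations(p) for name, p in readers.items()}
```

If many readers showed violations of this kind, the hierarchy would fail as a pragmatic one, whatever its merits under the other criteria.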
There have been various attempts to investigate the psycholo
gical reality (or separateness) of different skills. On the basis of
tests based on the taxonomy given above, Lunzer and Gardner
(1979) concluded that there was no evidence for the separate
existence of the skills, and that:
reading comprehension should not be thought of in terms of a
multiplicity of specialized aptitudes. To all intents and purposes such
differences reflect only one general aptitude: this being the pupil’s
ability and willingness to reflect on whatever it is he is reading.
(p. 64)
This is the so-called ‘Unitary Hypothesis’, as opposed to the ‘Multi-
divisible Hypothesis’. Lunzer et al. also investigated the pragmatic
validity of their hierarchy of skills, testing the hypothesis that:
there exists an identifiable group of pupils whose performance on
higher-level tasks is defective to a degree which would not be pre
dicted on the basis of their performance on lower-level tasks.
(p. 61)
Again they found no evidence for this, although they did find
some evidence of a difficulty hierarchy. Further research in these
areas is discussed in Part 3 in relation to testing.
Before we leave the topic of skills, we can make the following
general comments:
1. The possession of a specialised comprehension skill, if we hypo
thesise such a thing, does not guarantee success in completing
a particular task. We are all capable of making mistakes. This
argues for the need for a longitudinal study into the reality or
otherwise of such skills.
2. The conclusions by Lunzer et al., cited above, can only apply to
comprehension. Virtually everyone concedes that decoding is a
separate component/skill, since, as pointed out earlier, normal
young children can comprehend, without being able to de
code, while some disabled individuals can decode without com
prehending. Hence we have to accept at least a twin-skills model
(this discussion is taken further in Section 3.2, p. 120).
3. Pragmatic validation is not the only form of justification for
skills taxonomies. Difficulty has been mentioned as a criterion
for a hierarchy, and most of us would presumably agree that,
on average, ‘Understanding the relations within the sentence’
is an easier skill than ‘Extracting salient points to summarise
(the text, an idea)’. The L2 learner, forced back to an earlier
developmental stage by the difficulties of unfamiliar syntax or
lexis, may well have to wait before being able to summarise
a text.
Finally, as said before, skills are useful tools for the development
of both teaching materials and tests. In spite of the doubts that
have been raised, we shall continue to make use of the taxonomies.
Strategies
Reference to Figure 2.1 shows that strategy research is a later
development than skills research, only becoming popular in the
1980s. The research methodology is very different from that asso
ciated with skills. In much skills research, the investigator begins
with a taxonomy of skills, arrived at, perhaps, by means of text
analysis. The psychological validity of this taxonomy may then be
empirically checked. In strategies research, on the other hand,
the researcher begins by having subjects (often divided into ‘good’
and ‘poor’ readers) read a text and, either retrospectively at the
end of reading or at points during reading, report on what they
are doing. The strategies revealed by these reports are then cat
egorized. If a prior division of subjects has been made, an attempt
is often made to equate some aspect of strategy use to the ‘good’
or to the ‘poor’ group.
Because of the time-consuming methodology, the number of
subjects tends to be small, and there is an emphasis on qualitative
rather than quantitative results. However, it should be noted that
the uncovering of strategies is pragmatic; hence, unlike skills, their
psychological validity does not need to be investigated.
Two fairly representative examples of strategy research are the
investigations of Olshavsky (1977), who did her work with English-
speaking readers, and Sarig (1987), who worked with bilingual
Hebrew- and English-speaking subjects. Since the focus here is on
the actual strategies used by readers, together with definitions of
what constitutes a strategy, no details are given as to the hypo
theses examined, or the discussion of strategy use.
The strategies detected and categorised by Olshavsky were as
follows:
■Word related: Use of context to define a word, synonym substitu
tion, stated failure to understand a word.
■Clause related: Re-reading, inferences, addition of information,
personal identification, hypothesis, stated failure to understand
a clause.
■Story related: Use of information in story to solve a problem.
Sarig used a similar technique to investigate the behaviour of
Hebrew-speaking students reading in English and in Hebrew. Again
a ‘think-aloud-when-reading’ technique was used, though it is not
clear from Sarig’s account precisely when they verbalised. Sarig
refers to responses to any particular problem as ‘moves’, and strat
egies as being combinations of moves. To avoid confusion, how
ever, her ‘moves’ will be referred to as strategies. The strategies
she uncovered and categorised were as follows:
■Technical aid: Skimming, scanning, skipping.
■Coherence detecting: Identification of macroframe, use of content
schemata, identification of key information in text, etc.
■Clarification and simplification monitoring: Syntactic simplification;
using synonyms, circumlocutions, etc. Change of planning, mis
take correction, ongoing self-evaluation, controlled skipping,
repeated reading.
Even given the ten years between the two papers, it is striking
just how different are the two lists of strategies. While, as claimed
above, it can be argued that the detection of strategies, unlike
that of skills, is pragmatic, there is clearly an element of subjectiv
ity both in identifying and in categorising them. This subjectivity
may be reduced if we can agree on a definition of strategies.
Definitions
Both Olshavsky and Sarig view reading as ‘a problem-solving pro
cess’. Admittedly there may be some problems defining ‘problem’,
but, in commonsense terms, we can regard strategies as ways of
getting round difficulties encountered while reading. Thus, initially
at least, strategies can be seen as responses to local problems in a
text. We should also include in our definition a reference to the fact
that the response must be a conscious one. Olshavsky claims that
a strategy is ‘a purposeful means of comprehending the author’s
message’ (p. 656; our emphasis). Pritchard (1990: 275) defines a
strategy as ‘a deliberate action that readers take voluntarily to
develop an understanding of what they read’. Cohen (1998: 5)
points out that the question is controversial but comes down firmly
on the side of conscious choice:
In my view, the element of consciousness is what distinguishes strat
egies from those processes that are not strategic.
Given a definition along the lines we have indicated above, it is
hard to accept some of Olshavsky’s strategies as such, in particular
‘stated failure to understand a word’ or ‘stated failure to
understand a clause’. Olshavsky emends this by dividing
her strategies into ‘problem identification’ (i.e. monitoring), and
‘problem solving’. Another of her ‘strategies’ involves the reader
substituting synonyms during recall. But this is a general feature
of readers’ recall of text (cf. Steffensen and Joag-Dev, 1984).
manifests towards the text, e.g. skipping, are likely not to appear
on Munby’s list. Obviously, since monitoring is a reader-directed
activity, many of its manifestations will not appear, e.g. self-
evaluation in general, admitting failure to understand part of
the text, are again unlikely to appear. The fact that some such
activities do appear - e.g. scanning and skimming - may be an
indication that Munby’s ‘skills’ do, in fact, include a number of
‘strategies’.
■Strategies represent conscious decisions taken by the reader,
skills are deployed unconsciously. Another way of phrasing this
is that skills have reached the level of automaticity. Certainly
many of Munby’s skills, such as lexical recognition and syntactic
parsing, can be assumed to have reached automatic levels in L1
or advanced L2 readers, and hence would not be reported in
strategy research. There are difficulties associated with this cri
terion; all the descriptions detailed above include ‘re-reading’.
The regressions reported from eye-movement research could be
considered a type of re-reading, and it might be difficult to
decide just how conscious readers were of regressing. However,
the criterion of ‘conscious v automatic’ seems a good one to us.
■Strategies, unlike skills, represent a response to a problem, e.g.
failure to understand a word or the significance of a proposi
tion, failure to find the information one was looking for, etc.
This criterion is closely related to our first one: at a local level,
something only becomes a problem if one becomes aware of it.
It is true that the term ‘problem’ poses some difficulties in
itself. We have been using the term to refer to local difficulties
encountered when reading a text. However, in Newell and
Simon’s theory, used by Olshavsky, a problem may be anything
in the task environment which stands between the organism
and its goal. In this wider sense, if we decide to read a book for
information on a particular subject, the whole text of the book
becomes a problem. We discuss what we call ‘global’ strategies
in the next section.
On the whole, however, we agree with the distinction drawn by
Williams and Moran (1989: 223):
A skill is an ability which has been automatised and operates largely
subconsciously, whereas a strategy is a conscious procedure carried
out in order to solve a problem.
Note
We can expand this to point out that there is no necessary cor
relation between a particular reading behaviour and a particular
genre of text. We might assume that people are more likely to
apply their ‘careful reading’ processes to a study text, but text
books can be read for amusement, not learning, while texts taken
from tabloid newspapers may be scrutinised with great care. We
should also point out that readers are under no obligation to
maintain a particular reading behaviour throughout the length of
a text; they may switch from careful reading to skimming to search
reading to scanning and back to careful reading over a small
number of pages.
We can now compare and contrast the five provisional types,
in order to examine the factors which are involved in differenti
ating them.
1. Skimming, search reading, scanning and careful reading are
distinguished from browsing by the presence in the first four
of a clearly defined goal. The goals, of course, differ widely,
but in all four cases, the reader can be assumed to know before
reading what it is they want from the text.
2. Search reading, scanning, skimming, and possibly browsing are
distinguished from careful reading by the factor of selectivity.
In the last type, all the text can be presumed to be examined;
in at least the first two types, and probably the third, the reader
will deliberately either avoid, or pay minimum attention to
some parts of the text. In scanning, the parts ignored may
constitute a majority of the text.
3. In careful reading, and in skimming, the reader makes a con
scious effort to construct a macrostructure, the gist of the text.
In careful reading, this is likely to be done by reference to the
whole text, in skimming from parts of the text. In scanning,
there is no attempt to construct a macrostructure; in browsing,
some vague notion of the topic may be built up, but without
any attempt to retain it; in search reading it is probable that
only certain key ideas in the macrostructure will be sought.
Expanding a model
Figure 2.2 A development of Just and Carpenter’s model of the reading process. [Flow diagram; recoverable labels: parse syntactic structures; long-term memory; macrostructure; integrate with representation of previous text; end of sentence? (no); sentence wrap-up.]
Notes
1. Gough’s ‘decoder’ converts letters into phonemes. This is in line with
his apparent assumption that the end-product of reading is a spoken
sentence. Since we don’t share that assumption, the decoder has been
left out.
2. We do not know of experiments testing the effect of lack of
punctuation on reading. However, in Britain at least, the fact that teachers can
read students’ essays seems to be evidence that it is possible.
3. Rayner and Pollatsek admit that Goodman’s account can be viewed as
interactive, but consider it as basically top-down because ‘bottom-up
processing plays such a minor role’ and there are ‘so little constraints
on the interactions’. In a sense, though, this is a repetition of their
criticism that his account lacks precision, a criticism earlier made by
Gibson and Levin (1974).
4. De Beaugrande’s reference to background knowledge here should
remind us of the crucial role of such knowledge in, for example,
identifying pronominal reference. This role suggests that cohesion tests
may be a useful way of assessing the effect of background knowledge
on reading.
5. We are using the terms found in Halliday and Hasan (1976). Other
writers on cohesion, e.g. de Beaugrande, and Quirk et al. (1973) use
similar though not identical terminology.
6. It should be remembered that Kintsch and van Dijk’s model sets out
to describe the actual reading process, in which propositions are
processed in cycles, generally about 4 at a time, with selected propositions
being carried over to the following cycle.
3
Testing reading
comprehension(s)
Focusing on comprehensions
Testing components
Until the early 1980s the major concern in the testing of reading
was with the issue of methodology. However, in the 1980s one
noted a switch in concern away from how to test reading towards
a concern with what we were trying to test; a concern with the
nature of the reading construct itself - in broad terms, a move
Testing reading comprehension (s) 121
away from a focus on method to a focus on the content of reading
tests. The next section examines the componentiality of reading
and the implications of this for the testing of reading.
The issue of how to test, i.e. deciding which are the most
suitable test formats, is still a key decision and we will return to
this in Section 3.4. Once test developers have a clear idea of the
performance conditions that need to be built into a test (see
Section 3.3 below) and the reading skills and strategies that are to
be tested, only then can they make decisions on formats. In the
past the methods’ tail has often wagged the testing dog and
formats have been chosen without sufficient thought a priori to what
is being tested.
The reader will note that Williams and Moran (1989) and Grabe
(1991) identify reading components from all four levels A, B, C
and D in Table 3.1. Their inclusion of word-level components
relating to more specifically linguistic comprehension is indicative
that this is seen by many people as an important part of reading,
not as something separate. The contribution of the latter to tests of
reading ability is an important issue that will be taken up below.
Global comprehension, which can be related to Kintsch and
van Dijk’s macrostructure (Kintsch and van Dijk, 1978) normally
refers to comprehension beyond the level of micropropositions -
from macropropositions to discourse topic. Local comprehension
refers to the decoding of micropropositions and the relations
between them.5
Quantitative research
A number of empirical test-based studies, typically using factor
analysis, have cast doubt on the multidivisible nature of reading
(e.g. Lunzer et al., 1979; Rosenshine, 1980; Rost, 1993). Factor
analysis is a statistical procedure for extracting the extent to
which putatively different variables - in our case the so-called
‘skills and strategies’ in reading; reading types - in fact function
in a similar manner. If a number of putatively different skills and
strategies function in a very similar manner it is said that they
‘load on the same factor’ and we have at least to entertain the
possibility that they are not different at all, only a single construct
in different guises. If all conceivably different skills and strategies
load on a single factor, we have to consider the strong possibility
that there are in fact no skills and strategies at all, only a single
undifferentiated ability: reading. If some putative skills and
strategies function in a statistically similar manner and load fairly heavily
on one factor, while other putative skills and strategies function
statistically in another manner and load on a second factor, then
this is evidence that reading is at least bi-divisible.
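The idea of subtests 'loading on the same factor' can be illustrated with a minimal sketch. It is not a full factor analysis: it extracts only the first principal component of a correlation matrix by power iteration, with no rotation, as a rough stand-in. The subtest names and scores are invented for illustration and are not taken from any of the studies cited.

```python
from statistics import mean, pstdev

# Hypothetical scores for eight candidates on four subtests (illustrative only).
scores = {
    "main_idea":  [2, 3, 4, 5, 6, 7, 8, 9],
    "inference":  [3, 3, 5, 5, 7, 7, 9, 9],
    "detail":     [2, 4, 4, 6, 6, 8, 8, 10],
    "vocabulary": [6, 2, 7, 3, 8, 4, 9, 5],
}

def pearson(x, y):
    """Pearson correlation using population statistics."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def correlation_matrix(columns):
    return [[pearson(a, b) for b in columns] for a in columns]

def first_component(matrix, iterations=500):
    """Dominant eigenvalue and loadings of a correlation matrix (power iteration)."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    # A loading is the correlation between a subtest and the component.
    loadings = [x * eigenvalue ** 0.5 for x in v]
    return eigenvalue, loadings

names = list(scores)
matrix = correlation_matrix([scores[name] for name in names])
eigenvalue, loadings = first_component(matrix)
proportion = eigenvalue / len(names)  # share of total variance on this component

for name, loading in zip(names, loadings):
    print(f"{name:11s} loading on first component: {loading:.2f}")
print(f"proportion of variance: {proportion:.2f}")
```

In this invented data set the three comprehension subtests load heavily on the first component while the vocabulary subtest loads much more weakly, which is the kind of pattern that would prompt an analyst to look for a second factor.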
For example, the work of Lunzer et al. (1979) is often cited by
reading specialists as evidence that it is not possible through test
data to differentiate between the so-called subskills and strategies
in reading. This study is said to show that reading (at least as defined
by completing their reading tests) is a single undifferentiated
ability. However, it is interesting to note that while only one principal
factor - presumably undifferentiated reading - is identified in this
study through factor analysis, there does appear to be some doubt
(Lunzer et al., 1979: 55-7) concerning the strength of the loading
of test items testing word-meaning on that principal factor.
The reader must also remember that Lunzer et al.’s study
(as with many of those finding no evidence of multidivisibility)
was conducted on native speakers of English; in fact primary school
pupils, presumably still largely free of the specific linguistic
problems experienced by some non-native speakers. Our particular
concern in this book is different from that of Lunzer et al. in that
we are primarily interested in testing adult L2 readers who will
tend to be spread out across the language ability range.
The most recent investigation conducted by Rost (1993), again
on native speakers, found strong evidence of unidimensionality,
leading Rost to warn against differential skill component
interpretation for all available reading comprehension tests (1993: 88).
However, once again, it is important to note that, in the reported
factor analysis, a second factor that Rost believes to be vocabulary
did emerge when the factors were rotated. Rost (1993: 80) indeed
cites earlier research where ‘two factors of reading
comprehension, namely “vocabulary” or “literal reading” on the one hand,
and “general reading comprehension” or “inferential reading” on
the other’ emerged from the data (Johnson and Reynolds, 1941;
Stoker and Kropp, 1960; Vernon, 1962; Pettit and Cockriel, 1974;
Steinert, 1978).
There is further evidence in the literature that the
phenomenon of vocabulary loading on a separate factor is not uncommon.
Davis (1944) identified two important separate factors in
reading as ‘memory for word meanings’ and ‘reasoning in reading’
(a combination of weaving ideas together and drawing inferences
from them). Similarly, in a later study (Davis, 1968) a
recognition vocabulary test accounted uniquely for a sizeable proportion
(32 per cent) of the non-error variance. There is also evidence in
Spearritt’s reanalysis (1972) of Davis’s earlier data that vocabulary
tests are differentiable from the single basic ability ‘reasoning in
reading’ measured by other labelled reading components in the
reading comprehension tests used in the study. Spearritt (1972:
110) concluded:
Vocabulary is the best differentiated, as in both the Davis and
Thorndike analyses ... it could not in fact be subsumed under one
general factor with the other three skills.
Similarly, Rosenshine (1980: 543) admits to the fact that in three
out of the four analyses done on Davis’s data the one unique factor
that emerged as separate from the others was vocabulary
(‘remembering word meanings’), the only exception being Thorndike’s
(1973) analysis which he categorises as being less sophisticated than
Spearritt’s later study. Rosenshine cites data from Berg in support of
the non-divisibility position, but in four out of the five studies
summarised by Berg (see Rosenshine, 1980: 544) lexical competence
appears as a separate factor. Farr (1968) found two factors - one
clearly vocabulary, which loaded heavily on the three vocabulary
measures, and one that could be labelled as comprehension.
128 Reading in a Second Language
Though the quantitative studies reported above seem to suggest
that it may not be consistently possible to identify multiple, separate
reading components, there does seem to be a strong case for
considering vocabulary as a component separate from reading
comprehension in general. Given that most factor analyses in the studies
reported above produced more than one factor, it would be difficult
to maintain that reading is a unitary ability. Furthermore, even if
the components which load more heavily on a second factor also
load on the first general reading factor, it might be appropriate
only to select test items which load heavily on the first factor when
developing a measure of general reading ability. Alternatively, if
vocabulary is considered to be part of reading, a bi-divisible view
of reading would seem to be more appropriate.
Qualitative research
As part of a new wave of qualitative investigation in language
testing studies, Alderson (1990a; see also Alderson and Lukmani,
1989) investigated the reading component question through the
judgement of experts on what reading test items actually test. In
this study, groups of experts - usually students on MA courses -
were presented with a long list of posited reading components,
and asked to identify cold (‘heuristically’) what items in a pilot
version of an EAP reading test were measuring in terms of the list.
The resulting lack of agreement on assigning particular ‘skills and
strategies’ to particular test items, i.e. on agreeing what an item
was testing, and even whether an item was testing a ‘higher level’
or ‘lower level’ component, could be taken as evidence of the
indivisibility of the reading ‘skill’, or at the very least could be
seen as casting doubt on the feasibility of distinguishing reading
components. Nevertheless, these conclusions need to be subjected
to scrutiny.
The authors of the reading test items used in the Alderson
(1990a) study were aware of the possible overlap between
components tested by individual items. At the time, Weir (1983a: 346)
had summarised the approach to the design of the reading
component in the TEEP as follows (for ‘skills’ in this quotation, read
‘skill components’):
... we aimed to cover as many of the enabling skills in each of the
reading subtests ... as was feasible ... we indicate opposite each
item in the reading subtests what the Project Working Party and
other experts in the field considered to be the major focus of that
item. We were aware that though an item might be seen to be
dependent on a particular enabling skill for successful completion,
other skills and strategies might be contributing to getting the answer
right. We realised that the skills and strategies we were sampling
were not necessarily discrete.
Thus any conclusions regarding the feasibility of distinguishing
separate components, based on the inability of judges in the
Alderson (1990a) study to agree on what single component was
tested by individual items, must necessarily be open to question.
(For a discussion of further weaknesses in this study, see Weir et
al., 1990 and Matthews, 1990). Furthermore, any similar
investigations in this area should ensure that the experts involved share a
common understanding of the categories of description employed
in the study. Lumley (1993) emphasises the need for clear
definitions and a common understanding of the terms employed, in
particular ‘higher level’ and ‘lower level’ components, if the attempt
to assign components to test items is to be meaningful.
In contrast to Alderson’s findings, there is an alternative
literature which suggests that it is possible with clear specification of
terms and appropriate methodology for testers to reach closer
agreement on what skills and strategies are being tested (Anderson
et al., 1991; Brutten et al., 1991; Kobayashi, 1995; Lumley, 1993;
Weakley, 1993; Weir et al., 1990).
Data from ESP Centre, Alexandria, Egypt
Similar differential performance is emerging in data from a
battery of EAP tests under development by the Testing and
Evaluation Unit at the ESP Centre in Alexandria, Egypt. These are
short-answer question tests designed to test, separately, reading
comprehension types A, B, C and D in Table 3.1 above. Point
biserial correlations show that items in section 1 of the battery
testing at levels A and C correlate more with their own subtest
than they do with level D (the microlinguistic items), and vice
versa. Similarly, there is a small subset of students who, while
coping well with A and C operations, experience more difficulty
with the specifically microlinguistic items. So here again there is
the same phenomenon of differential performance on global as
against specifically microlinguistic items. It should be noted that,
in the Alexandrian case, the reading test format differed from
that used in the Reading study.
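The point-biserial comparison described above can be sketched as follows. The item responses and subtest totals are hypothetical, and the function simply applies the standard point-biserial formula; in practice an item is often removed from its own subtest total before the correlation is computed, a refinement omitted here for brevity.

```python
from statistics import mean, pstdev

def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item and a set of total scores."""
    p = mean(item)                     # proportion answering the item correctly
    q = 1 - p
    passed = [t for i, t in zip(item, totals) if i == 1]
    failed = [t for i, t in zip(item, totals) if i == 0]
    return (mean(passed) - mean(failed)) / pstdev(totals) * (p * q) ** 0.5

# Hypothetical data for eight candidates: one global comprehension item,
# the candidates' totals on the global subtest, and their totals on the
# microlinguistic subtest.
item_global = [1, 1, 1, 0, 1, 0, 0, 0]
global_totals = [9, 8, 7, 6, 6, 4, 3, 2]
micro_totals = [5, 7, 4, 6, 6, 6, 4, 5]

r_own = point_biserial(item_global, global_totals)
r_other = point_biserial(item_global, micro_totals)
print(f"correlation with own subtest:      {r_own:.2f}")
print(f"correlation with microlinguistic:  {r_other:.2f}")
```

An item that 'belongs' with its own subtest would be expected to show the higher of the two correlations with that subtest, as this invented item does.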
Text type
Unfortunately, as we noted in Chapter 2, many text analysis
procedures are so detailed and produce so much data that they can
be of little value to testers in making decisions on whether or not
to select a text for a test. Deciding what are appropriate text types
for the test population is a crucial step in test development. This
decision is currently best informed by needs analysis of the students’
target situations and by careful examination of the texts (and tasks)
used in other tests and teaching materials aimed at the particular
test population. (See Section 5.3 on page 274 for exemplification
from the development of the Advanced English Reading Test
(AERT) project in China.)
There has been a consensus for a number of years that texts
used both for teaching and testing should be ‘authentic’, though
this requirement has become more a matter of common sense
than, as originally, of almost missionary dogma. There has, however,
been a suggestion that in dealing with heterogeneous test
populations fully genuine texts are not essential. This is a view supported
by the work of Lewkowicz (1997) which indicates that texts might
only need to resemble those that the candidates will process in
the future in terms of salient ‘authentic features’. She argues that
full authenticity of text may not be necessary, attainable or
desirable. However, until such findings are substantiated by stronger
evidence than her initial pilot studies, we would be best served
by selecting texts which exhibit as many salient features of target
situation texts for the population as is possible.
Obviously the skills and strategies to be tested will also
influence selection: problem/solution, causative or comparison
texts from journals or textbooks may well lend themselves better
to testing reading carefully for main idea(s) comprehension than
more descriptive texts with lots of detailed information. Though,
as Carrell (1984: 464) points out, this might be unfair to certain
native language groups such as Arabs for whom it is a preferred
rhetorical pattern and more facilitative of recall than other patterns.
In careful reading the texts may not necessarily have clear main
ideas for selection and main ideas might have to be constructed
through propositional inferencing, whereas in skimming and search
reading they should be explicit.
Where candidates are expected to skim or search read lengthier
texts, these would ideally have a clear overt structure and be clearly
sequenced with a clear line of argument running through them.
A journal article or chapter from a textbook with clear sections
and headings, and where paragraphs contain topic sentences in
initial position which signal the information to be presented, may
prove suitable for testing these expeditious reading strategies.
Problem and solution, causative and comparison texts may have the
clearest, tightly organised structures (Carrell, 1984; Meyer and
Freedle, 1984; Meyer, 1975). One might also look for texts which
are overtly organised into sections. Texts without a clear structure
may well be authentic but they do not lend themselves easily to use
in testing expeditious reading, just as in real life they are difficult
to follow quickly, to summarise or to make notes on. A collection
of description texts (Carrell, 1984; Meyer and Freedle, 1984) may
be the best vehicle for testing scanning for specific detail.
We realise that the guidelines we have presented are, at best,
rather skeletal. Lewkowicz (1997) points out that a key area for
future research is in determining the text types that allow the best
testing of the various skills and strategies. Kobayashi (1995) has in
fact made a good start in this area. She found (pp. 266-7) that:
there seems to be a close relationship between text type and question
type. For example, tightly organised texts tended to produce more
questions on main ideas than less organised texts. At least, it seemed
easier to generate a variety of questions when texts were highly
organised. If texts were loosely organised, on the other hand,
questions tended to focus on details or literal understanding.
At lower levels, texts employed in tests are often artificially
constructed or simplified because of the restrictions imposed by the
structures and lexis available to the students. This may seriously
constrain the range of strategies and skills that can be tested and
it may be that expeditious strategies are simply not testable at this
level because of length constraints.
In Section 4.3 we return to the issue of text selection for teaching
purposes and a number of criteria for selecting text are explored
in detail.
Propositional content
Topic familiarity
Topic familiarity is increasingly seen as one of the criterial
determinants of performance in reading tests (Khalifa, 1997; Aulls, 1986:
124-5). This obviously overlaps with the nature of the existing
schemata candidates possess (see Section 2.2 for a full discussion
of this). Weir (1990, 1993) points out that the topic should be
selected from a suitable genre, at an appropriate level of specificity,
and should not be culturally biased or favour any section of the
test population. The issue of what is a generally accessible text
remains with us. In those situations where we are writing tests for
heterogeneous groups of students, we are by necessity forced to
select texts with a wider appeal than is the case when we have a
more homogeneous group. Clapham (1996a/b) suggests that in
her research it is only with more specific texts that background
knowledge has a significant effect on text comprehension. We
might also need to consider the effect of interestingness. The
work of Spilich et al. (1979) suggests that this is an important
influence on the interaction between readers and text.
The content of a text should be sufficiently familiar to
candidates so that candidates of a requisite level of ability have sufficient
existing schemata to enable them to deploy appropriate skills and
strategies to understand the text.
Royer and Cunningham (1978) suggest that texts selected should
be within the knowledge base of the candidates; they introduce
the concept of ‘tailoring’ texts for specific audiences. Such
matching may not be available in tests with large heterogeneous
populations and a cruder more general strategy may be unavoidable.
As part of the a priori validation process the familiarity of the
text can be established through survey and we would want to
avoid texts at the extremes of a familiarity continuum (Khalifa,
1997). In general, a text should not be so unfamiliar that it
cannot be mapped onto a reader’s existing schemata. Conversely, the
content should not be so familiar that any question set can be
answered without recourse to the text itself (Roller, 1990). This
should be checked rigorously, whichever of the formats for testing
reading are employed. A key pretesting check is to determine if
any of the questions are answerable without recourse to the text,
and any such questions should be removed. The reader is referred
to the discussion of schemata in Section 2.2.
Texts currently employed in testing reading tend to be genuine
and undoctored and, as far as possible, are selected with appropriacy
for the target situation needs of the test takers in mind (West,
1991). However, it is not always easy to determine which texts are
most appropriate for which test takers; a postgraduate in business
studies may well have come from a science or engineering academic
background; an undergraduate may be studying a variety of
subjects. The practicalities of constructing multiple forms for an EAP
proficiency test for different subject areas are intimidating, and
though there is some evidence that performance is enhanced by
background knowledge in the content area of a reading
comprehension passage (Alderson and Urquhart, 1985) the evidence is
not conclusive (see Clapham, 1994; Ja’far, 1992; Koh, 1985).
However, it does appear that the more specific a text the more
important the contribution of background knowledge to comprehension
(Clapham, 1994: 281-2), the less specific a text the more important
the contribution of language proficiency. This would encourage
us to select texts with a preponderance of semi-technical as against
technical vocabulary. Computer programs for concordancing texts
are already in use in the People’s Republic of China for checking
this facet of texts in test design.
As far as a canonical culture is concerned, students sitting an
EAP test, for example, should not, if possible, be faced with texts
which are too far outside their academic culture. If the texts are
selected well, testees should be inside what Swales (1990) has termed
the ‘discourse community’. There remains the problem of the
extent to which the test developers also belong to the appropriate
community.
If general texts are to be selected in Academic Purpose tests, it
appears that the non-science texts may be the most suitable as,
although non-science students seem to be adversely affected by
science texts in tests, the reverse does not appear to be the case.
Most science students appear not to be adversely affected by
non-science texts in tests as they are familiar with these areas in their
own reading (Clapham, 1994: 277).
Grabe (1991) found that a major implication of research in
this area is that students need to activate prior knowledge of a
topic before they begin to read. If this is absent then they should
be given ‘at least minimal background knowledge from which to
interpret the text’. It is interesting to note that, in our survey of
teaching tasks used in coursebooks, prediction was seen as a useful
pre-reading strategy. In contrast, in the analysis of testing tasks we
carried out, a pre-reading activity was seldom built in.
Vocabulary
Researchers have attempted to differentiate three levels of
vocabulary: common core, subtechnical and technical (Inman, 1978;
King, 1989). In tests for heterogeneous populations care should
be taken to avoid technical terms (Robinson, 1991: 28). For higher
level students in particular, we need to examine whether the
lexical range is appropriate in terms of common core, technical and
subtechnical vocabulary. In EAP tests, where the focus is on lexis,
there is a preference for testing subtechnical words which Cowan
(1974: 391) defines as: ‘Context independent words which occur
with high frequency across disciplines’ (see also Yang, 1986; King,
1989). Marton (1976: 92) sees subtechnical words as academic
vocabulary, ‘the words have in common a focus on research, analysis
and evaluation - those activities which characterise academic work’.
In general, this seems sensible advice but it is not always easy to
determine the level of a word unequivocally and reliably.
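One crude way of operationalising 'words which occur with high frequency across disciplines' is to look for words that recur in samples from every discipline and are not on a common-core list. The sketch below does only that. The three one-line 'corpora', the tiny common-core list and the threshold are all invented for illustration, and a serious classification would need far larger samples and proper tokenisation.

```python
from collections import Counter

# Hypothetical mini-corpora, one per discipline (illustrative only).
corpora = {
    "biology": "the analysis of cell data",
    "economics": "the analysis of market data",
    "engineering": "the analysis of stress data",
}

# Hypothetical common-core list; real lists run to thousands of words.
common_core = {"the", "of"}

def subtechnical_candidates(corpora, common_core, min_count=1):
    """Words found at least min_count times in every corpus, minus common core."""
    counts = {name: Counter(text.lower().split()) for name, text in corpora.items()}
    shared = set.intersection(*(set(c) for c in counts.values()))
    return sorted(
        word
        for word in shared
        if word not in common_core
        and all(c[word] >= min_count for c in counts.values())
    )

candidates = subtechnical_candidates(corpora, common_core)
print(candidates)
```

Words that appear in only one sample ('cell', 'market', 'stress') are left as candidates for the technical level; the output of such a procedure is a starting point for judgement rather than a reliable classification.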
Channel
Particularly in science texts, diagrams are extensively used to
convey information (Ferguson, 1977; Shepherd, 1978). The presence
of diagrams affects the way text is written and processed (Hegarty
et al., 1991; Koran and Koran, 1980). Test developers need to
decide on the nature and amount of non-verbal information that
is desirable, e.g. graphs, charts, diagrams, etc.5
Size
Johnston (1984: 151) notes that currently texts used in reading
comprehension tests tend to be many and brief. The length of
text(s) that candidates are exposed to will influence the strategies
and skills that the candidate may be asked to deploy. If texts
are too short it may not be possible to test expeditious reading
strategies (search reading, skimming and scanning), only careful
reading.
Difficulty
Time control
If the test does not control how much time students spend on
certain items/sections this may change the operations that are
needed to answer them. Too much time spent on a search
reading question may change it into one that only requires a slow
careful reading. A similar problem might arise in careful reading
tests where too much time on earlier items means that subsequent
questions which demand careful scrutiny have to be answered
more hurriedly, and a candidate is forced into constructing invalid
test-taking strategies to come up with an answer in the restricted
time left.
Careful thought needs to be given to grouping questions into
sections (and most probably using different texts for different
skills and strategies); empirically determining time necessary to
deploy the required skills/strategies; and carefully structuring the
test through rubrics and invigilation so that timings are adhered
to. If more than one passage is used within a section, then time
controls need to be applied here as well or there is a tendency to
spend more time on the earlier passage especially in expeditious
reading. Thus earlier passages might become tests of careful
reading and only the final passage is processed quickly in the absence
of strict time controls.
User friendliness
How much help is given? A number of factors need to be taken
into account, such as the clarity of the rubrics and whether the
rubrics are in the First Language (L1) or the Target Language
(TL). Shohamy (1984) goes further and suggests that questions
set in L1 are easier than in TL and the latter may not give as
accurate a measure of comprehension. In monolingual contexts it
seems logical that candidates might be permitted to write their
answers in their mother tongue as well as having the rubrics and
questions in L1. Lee (1986) suggests that recall of a text was
significantly better when done in L1 rather than TL.
Format familiarity
Weir (1993) advises that every attempt should be made to ensure
that candidates are familiar with the task type and other
environment features before sitting a test. Sample tests or examples in
test manuals should be available for national examinations, and in
the school context similar formats should have been practised in
class beforehand. Where such help is not available in the pre-test
situation, thought might be given to providing examples at the
start of the test paper if item types are not familiar to candidates.
Anderson and Armbruster (1984: 659) gave support to this when
they argued that ‘performance on the criterion task is a function
of knowledge of the task’.
Cloze
Weir (1990, 1993) describes how the term ‘cloze’ was first
popularised by Taylor (1953) who took it from the gestalt concept of
‘closure’, which refers to the tendency of individuals to complete
a pattern once they have grasped its overall significance (Alderson,
1978). Johnston (1984: 151) details how it had its earlier roots in
the completion task used by Ebbinghaus in 1897 in his efforts to
find a measure of mental fatigue, for which it proved
unsatisfactory though its value as a measure of intellectual ability was noted.
In cloze the reader comprehends the mutilated sentence as a
whole and completes the pattern. Words are deleted from a text
after allowing a few sentences of introduction. The deletion rate is
mechanically set, usually between every 5th and 11th word.
Candidates have to fill each gap by supplying the word they think has
been deleted.
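The mechanical deletion procedure just described can be sketched in a few lines. This is an illustrative simplification: the lead-in is specified as a number of words rather than sentences, punctuation stays attached to the word it follows, and the function names and gap marker are our own.

```python
def make_cloze(text, rate=5, intro_words=0, gap="____"):
    """Delete every rate-th word after an untouched lead-in of intro_words words."""
    gapped, answers = [], []
    for index, word in enumerate(text.split()):
        position = index - intro_words + 1   # 1-based position after the lead-in
        if position > 0 and position % rate == 0:
            answers.append(word)
            gapped.append(gap)
        else:
            gapped.append(word)
    return " ".join(gapped), answers

def score_exact(answers, responses):
    """Exact-word scoring: credit only the deleted word itself (case-insensitive)."""
    return sum(a.lower() == r.strip().lower() for a, r in zip(answers, responses))

passage, key = make_cloze("a b c d e f g h i j", rate=5)
print(passage)   # a b c d ____ f g h i ____
print(key)       # ['e', 'j']
```

Changing `rate` or `intro_words` changes which words are deleted from the same passage, which is one reason why results can vary with the deletion rate and starting point chosen.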
Alderson (1978: 39) described how
The general consensus of studies into and with cloze procedure for
the last twenty years has been that it is a reliable and valid measure
of readability and reading comprehension, for native speakers of
English ... As a measure of the comprehension of text, cloze has
been shown to correlate well with other types of test on the same
text and also with standardised tests of reading comprehension.
Cloze tests are easy to construct and easily scored if the exact
word-scoring procedure is adopted. With a 5th-word deletion rate
a large number of items can be set on a relatively short text and
these can exhibit a high degree of internal consistency, in terms
of Kuder-Richardson coefficients. This consistency may vary
considerably, dependent on the text selected, the starting point for
deletions and the deletion rate employed (Alderson, 1978).
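For readers unfamiliar with Kuder-Richardson coefficients, the sketch below computes KR-20, one member of that family, for a small matrix of right/wrong responses. The data are invented, and we assume here that the population variance of the total scores is used, which is one common convention.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a list of candidates, each a list of 0/1 item scores."""
    n_candidates = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    pq_sum = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_candidates  # item facility
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / pvariance(totals))

# Hypothetical responses: five candidates, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

reliability = kr20(responses)
print(f"KR-20 = {reliability:.2f}")   # KR-20 = 0.80
```

A longer test on the same text will tend to give a higher coefficient, which is part of the attraction of a 5th-word deletion rate.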
Some doubts have been expressed, especially concerning its
validity as a device for testing global comprehension of a text.
One of its main flaws is that it seems to produce more successful
tests of syntax and lexis at sentence level, comprehension of the
immediate local environment, than of reading comprehension in
general or of inferential or deductive abilities (Alderson, 1978;
Chihara et al., 1977; Kintsch and Yarborough, 1982; Kobayashi,
1995; Markham, 1985). Alderson (1978: 99) found that
. . . cloze is essentially sentence bound . . . Clearly the fact that cloze
procedure deletes words rather than phrases or clauses must limit
its ability to test comprehension of more than the immediate
environment, since individual words do not usually carry textual cohesion
and discourse coherence (with the obvious exception of cohesive
devices like anaphora, lexical repetition and logical connectors).
The process underlying successful completion appears to be largely
bottom-up with an emphasis on careful passive decoding at the
word or immediate constituent level. The focus appears to be on
local comprehension at the microlinguistic level rather than global
comprehension of ideas encoded by the writer across the text as a
whole. Bernhardt (1991b: 198) comments:
... it focuses a reader’s attention on individual words to the
detriment of a global understanding of the text ... It clearly has little if
anything to do with a reader’s understanding of a piece of connected
discourse.
In reading for academic study purposes it is difficult to see how
it can test the ability to read through a text expeditiously or
carefully to extract main ideas and important detail. Cloze appears to
have little to do with a reader’s understanding of a piece of
connected discourse (Markham, 1985), measures information only
within clause boundaries (Kamil et al., 1986; Shanahan et al., 1982)
and focuses attention on individual words to the detriment of
global understanding of a text. That such decoding is seen as the
hallmark of a poor reader (automaticity, that is, rapid context-free
word and phrase recognition, being the hallmark of the fluent reader
according to Carrell et al., 1988: 94-5 and Stanovich, 1981: 262)
may lead us to question its place in either teaching or testing.
Bernhardt (1991b: 197) argues that as far as the construct
validity of cloze as a test of reading is concerned, ‘cloze testing is
profoundly inadequate’.
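To make concrete why cloze items sit at the word level, the conventional fixed-ratio form of the procedure can be sketched as follows. This is a minimal illustration only: the deletion rate `n`, the intact `lead_in` and the function name `make_cloze` are our own assumptions, not a specification taken from any of the studies cited.

```python
def make_cloze(text, n=7, lead_in=10):
    """Blank out every n-th word after an intact lead-in.

    Returns the gapped text and the answer key. Words are split on
    whitespace, so attached punctuation is deleted with the word
    (a simplification assumed for this sketch).
    """
    words = text.split()
    gapped, key = [], []
    for i, word in enumerate(words):
        # Count positions only after the lead-in so the opening stays intact.
        if i >= lead_in and (i - lead_in + 1) % n == 0:
            key.append(word)
            gapped.append("____")
        else:
            gapped.append(word)
    return " ".join(gapped), key
```

Because the deletion rule is blind to clause and discourse structure, each gap is a single word chosen by position alone, which is consistent with the concern above that such items tap only the immediate environment.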
Multiple-choice questions
In reaction to the earlier ‘pre-scientific’ SAQ tests, an interest
developed in using the more objective MCQ format which still
appears in major international second language tests, e.g. TOEFL
and UCLES examinations, to this day.
Weir (1993) details how multiple-choice tests exhibit almost
complete marker reliability as well as being rapid and often more
cost effective to mark than other forms of written test. The marking
process is totally objective because the marker is not permitted to
exercise judgement when marking the candidate’s answer;
agreement has already been reached as to the correct answer for each
item. The format allows scripts to be machine marked in large-
scale examinations, as in China, where over 2 million take the
College English Test (CET). Selecting and setting items are,
however, subjective processes (Meyer, 1985) and the decision about
which is the correct answer can be a matter of subjective
judgement on the part of the item writer or moderating committee.
However, even for experienced examiners it is extremely difficult
and time consuming to develop a sufficient number of decent items
on a passage. Items need to be validated through trialling before
we can be confident of their statistical properties, e.g. facility and
discrimination. Items developed for the CET in China
go through a number of rigorous trialling phases, and even so it
takes the national moderating committee ten days to finalise the
papers each year. Given how difficult it is to write such items,
there must be a serious question mark against teachers using this
format to test reading for practical reasons alone.
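The two item statistics mentioned above can be illustrated with a short sketch. Facility is taken here as the proportion of candidates answering an item correctly, and discrimination as the classical upper-lower index (facility in the top-scoring group minus facility in the bottom-scoring group). The function names and the choice of one third of the sample for each group are assumptions made for illustration; other cut-offs and other indices are also in use.

```python
def facility(item_scores):
    """Proportion of candidates who got the item right (1 = right, 0 = wrong)."""
    return sum(item_scores) / len(item_scores)


def discrimination(item_scores, total_scores, fraction=1 / 3):
    """Upper-lower discrimination index for one item.

    Candidates are ranked by total test score; the index is the item's
    facility in the top group minus its facility in the bottom group.
    `fraction` (group size as a share of the sample) is an assumed value.
    """
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical trial data: six candidates, one item.
totals = [30, 28, 25, 15, 12, 10]
item = [1, 1, 1, 0, 1, 0]
item_facility = facility(item)
item_discrimination = discrimination(item, totals)
```

An item that nearly everyone gets right or wrong, or one that weaker candidates answer as well as stronger ones, would be flagged for revision at the trialling stage; this is part of why a usable set of items takes so long to produce.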
In more open-ended formats for testing reading comprehension,
e.g. short-answer questions, the candidate has to deploy the skill
of writing. The extent to which this affects accurate measurement
of the trait being assessed has not been established. Multiple-choice
tests avoid this particular difficulty.
With the growth of interest in overall text comprehension as
against decoding in the 1970s, an interest in top-down processing
as against bottom-up decoding, testers also became more aware of
assessing comprehension of text at the global level. A comparison
of the ELBA test with the ELTS test makes this distinction clear. It
is, however, extremely time consuming and demanding to get the
requisite number of satisfactory items for a passage, especially for
testing strategies such as skimming in the MCQ format. A particular
problem appears to lie in devising suitable distracters for items
testing the more extensive receptive strategies. West (1991: 63)
comments:
. . . while multiple-choice reading items are well able to test isolated
details or ‘fragmentary’ comprehension, they are not very suitable
for more global tests of reading. By ‘global’ reading is meant some
broader response to the text - either comprehension across the
text as a whole (or at least a considerable portion of it) or an
understanding of the text as a text: an appreciation of the
characteristics of the text type, the intended audience, the writer’s
intention, the overall message, or the structure of the text.
Although MCQ items could (albeit with some difficulty) be
written in these areas, such items would seem to inhibit the use of
top-down strategies (skimming, predicting), not least because they
are likely to encourage test takers to try to match the stem and
options with words in the text. Heaton (1988) had noted earlier
that, for global comprehension activities, it is more helpful to set
simple open-ended questions rather than multiple-choice items;
otherwise students will find it necessary to keep in mind four or
five options for each item while they are trying to process the text.
There must be some doubt about the validity of MCQ tests as
measures of reading ability. Answering multiple-choice items is an
unreal task, as in real life one is rarely presented with four
alternatives from which to make a choice to signal understanding. In a
multiple-choice test the distracters present choices that otherwise
might not have been thought of. In MCQ tests we do not know
whether a candidate’s failure is due to lack of comprehension of
the text or lack of comprehension of the question. A candidate
might get an item right by eliminating wrong answers, a different
skill from being able to choose the right answer in the first place.
Nevo (1989) details how test-taking strategies can lead to right
answers for some candidates and reading strategies to incorrect
ones on an MCQ test.
Bernhardt (1991b: 198) raises the issue of passage
independence in such tests and cites evidence of candidates being able to
determine answers without recourse to the passage (see also
Pyrczak, 1975; Jarvis and Jensen, 1982; Barnett, 1986). Evidence of
this being a problem is also presented in Katz et al. (1990).
There is also some concern that students’ scores on multiple-
choice tests can be improved by training in test-taking techniques
and that such improvement reflects an enhanced ability to do
multiple-choice tests rather than any increase in language ability.
This is a matter which is in need of serious investigation.
Carrell et al. (1989) found that the effectiveness of training in
reading strategies varied according to the test format employed.
Metacognitive training led to improvement in the sample whose
ability was measured by open-ended questions but not for those
on MCQ tests.
Weir (1993) draws attention to the danger of the format having
an undue effect on measurement of the trait. There is some
evidence that multiple-choice format is particularly problematic in
this respect. This has been evidenced by low correlations both with
alternative reading measures and with other concurrent external
validity data on candidates’ reading abilities (see Weir, 1983a).
The scores obtained by candidates might have been affected by
the method used. This is not a problem with direct measures of
language ability.
Reading tests in this approach were more concerned with the
psychometric properties of the test than with the nature of the
construct being measured. Thus, in earlier versions of the TOEFL
reading test one could find a section of decontextualised vocabulary
items being used as indicators of reading ability; this is very much
a bottom-up approach to reading and a very limited part of it at
that. As Spolsky (1995: 4) points out: ‘what can be measured
reliably is not necessarily the same as the ability one is interested in’.
A more recent variant of this technique termed ‘multiple
matching’ (where the answers to all questions plus a number of distracters
are all provided in the same list for the candidate to select from)
appears in a number of recent ELT exams produced by UCLES.
Its advantages are enumerated by West (1991). However, its
proponents still do not explain how the underlying processes that
help select the right answer from the many available equate with
normal processing for the reader. It is nevertheless an
improvement on traditional MCQ and in those situations where tests need
to be machine scored because of huge populations its potential
should be investigated.
Recall measures
The measures we have discussed so far for assessing comprehension
have been the choice of educators: write questions about
information in a passage and then evaluate readers’ responses to them.
Meyer and Rice (1984: 320) describe how psychologists have tended
to get readers to write down all they can remember from texts.
They point to difficulties in marking such protocols, not least the
difficulties of establishing marking frames, especially where
inferences are made.
Kobayashi (1995: 111), in reviewing recall protocols, comments:
Recall protocols can be classified as either oral or written in terms
of the language mode, or either immediate or delayed in terms of
time of recall, or either free or probed, i.e. with or without cues for
recalls. First a text is analysed in terms of idea units (or propositions)
and this analysis becomes a template for scoring recalls. The number
of propositions recalled after listening or reading will be counted
as scores.
She notes that the method has not yet gained much ground in
testing but is increasingly common in second language research
studies (see also Lee, 1986, and Lund, 1991). She points to the
difficulty in establishing propositions and a hierarchy of relative
importance within these. The difference between reproduction
and comprehension is also noted. More mature readers who
integrate ideas and synthesise may be penalised because their protocols
lack details and they may have used their own words.
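The scoring logic described in this passage can be sketched in a few lines. The sketch assumes that a marker has already judged which idea units are present in a protocol (the hard part, which no code removes), and that each unit in the template carries an importance weight. The unit labels, weights and function name are hypothetical.

```python
def score_recall(template, recalled_ids):
    """Score one recall protocol against an idea-unit template.

    template     -- dict mapping idea-unit id to an importance weight
    recalled_ids -- ids the marker judged to be present in the protocol
    Returns the raw count plus raw and weighted proportions.
    """
    recalled = set(recalled_ids)
    unknown = recalled - set(template)
    if unknown:
        raise ValueError(f"units not in template: {sorted(unknown)}")
    raw = len(recalled)
    weighted = sum(template[u] for u in recalled)
    return {
        "raw": raw,
        "raw_proportion": raw / len(template),
        "weighted_proportion": weighted / sum(template.values()),
    }


# Hypothetical template: one main idea (weight 3), one supporting
# point (weight 2) and two minor details (weight 1 each).
template = {"u1": 3, "u2": 2, "u3": 1, "u4": 1}
detail_reader = score_recall(template, {"u3", "u4"})
gist_reader = score_recall(template, {"u1"})
```

On a raw count the reader who reproduces two minor details outscores the reader who recalls only the main idea; with importance weights the order is reversed. This is one way of seeing why the hierarchy of propositions matters, and why readers who synthesise may be penalised under simple counting.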
Bernhardt (1991b: 200-10) proposes immediate recall as an
alternative to traditional testing measures, drawing on her
experience in cognitive psychology and L1 reading research. However,
at present the analysis of protocols necessary to determine an
estimate of performance would take far too long for this technique
to be feasible for either classroom or large-scale testing, particularly
if the recall protocols are to be written in the native language of
each author.
Developing a marking scheme even on a relatively small passage
can take up to 50 hours (Bernhardt 1991b: 202) and marking
individual protocols, an hour for relatively short passages. She
does, however, suggest a revised and more efficient scheme, based
on breaking the propositions into pausal units, which is quicker
but there must still be some concern that scoring is based on very
small units of information and often single lexical items; this raises
again the central issue of construct validity.
As well as problems in efficiency there is a serious question
mark against the validity of this procedure for testing reading
comprehension. Kobayashi (1995: 113) draws attention to the fact
that readers may not be able to remember all they have understood.
Comprehension is not necessarily equatable with remembering.
3.5 CONCLUSIONS
In Chapter 2 we looked at the theory relating to componential
and process models of reading. We noted that a comprehensive
model of the processing involved in different types of reading is
not yet available and that, for the present, L2 researchers, teachers
and testers might be better served by focusing on the components
of reading ability. In Section 2.2 an a priori case was made for there
being more than one component in reading and a preference was
stated for a three, as against a two, component model. In Section
2.3 we translated these components into terms of skills and
strategies which are more familiar to testers and teachers.
In Chapter 3 we have examined test-driven research for
empirical evidence relating to the componentiality issue in terms of what
strategies and skills can and should be assessed. The data
emerging from such studies offer some tentative but encouraging support
for the theoretical view of the components favoured in Chapter 2.
The rigorous requirements of validation in language testing
have necessitated a closer examination of the parameters of texts
than in Chapter 2, Section 2.2. The need for explicit specification
in testing means that we also have to establish any performance
conditions that may affect the product of reading comprehension
and perhaps even processing itself. We believe that these text-
based facets must form part of any definition of reading objectives
or any definition of reading proficiency. All skills and strategies
are performed under certain performance conditions and not in
the vacuum or text-neutral position that purely theoretical work
sometimes assumes. We do not just read expeditiously but rather we
skim/search read/scan a certain type of text, of a certain length,
of a certain degree of familiarity, under certain time constraints,
etc. (see Section 3.3 above). Altering the conditions will alter
performance in comprehending a text and possibly the way we
process the given text.
In terms of the test data presented in this chapter, the argument
as to whether reading is multidivisible, consisting of a number of
components which can be identified clearly, or whether it is an
indivisible, unitary process, is still not fully resolved. The ubiquitous
call for further research is necessary. If a unitary view is to be
convincingly rejected, future research will need to demonstrate the
consistent presence of at least a second component in repeated
analyses across a range of samples of ESOL candidates. Secondly,
future research will need to investigate whether such components
are identifiable. Finally, it will have to establish the extent to which
each component has a meaningful effect on the measurement of
reading comprehension. How much of the overall variance does
each component explain in a reading test? It will be important to
use more exigent statistical techniques to test whether the presence
of each component is statistically significant.
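The question of how much variance a component explains can be made concrete with a small sketch. It estimates the share of total variance carried by the first principal component of a set of subtest scores, using power iteration on their correlation matrix. It is an illustration under stated assumptions, not a substitute for a statistical package: the function names are ours, every subtest is assumed to show some variation in scores, the subtests are assumed to be mostly positively correlated (so that the all-ones starting vector is not orthogonal to the first component), and no significance test is attempted.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlations between subtests (one list of scores per subtest)."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    devs = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [sqrt(sum(d * d for d in dev)) for dev in devs]
    k = len(columns)
    return [[sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def first_component_share(columns, iterations=500):
    """Proportion of total variance explained by the first principal component.

    Power iteration finds the largest eigenvalue of the correlation
    matrix; dividing by the number of subtests (the trace of that
    matrix) gives the proportion of variance explained.
    """
    r = correlation_matrix(columns)
    k = len(r)
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    rv = [sum(r[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(v[i] * rv[i] for i in range(k))
    return eigenvalue / k
```

A share close to 1 would be consistent with the unitary view for that sample; a markedly lower share leaves room for a second component, whose size and significance would then need to be examined with more exigent techniques.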
There is cause for immediate concern that wholesale adherence
to either the unitary or the multidivisible view in language testing
may be problematic. As a matter of urgency we need to develop
tests which are maximally valid tests of the skill components
at levels A, B, C and D in Table 3.1. If, in constructing tests of
expeditious and careful reading strategies, test constructors
faithfully mirror these in the mapping of texts for testing purposes,
then we might be able to make a stronger case for suggesting
that we are actually testing these skills or strategies. Student
introspection at the piloting stage might lend further credence to our
efforts. Statistical analysis of data from a normally distributed test
population, in particular principal components analysis, might add
further weight to the success of operationalising the constructs.
A full account of a systematic and principled methodology for
researching the construct of reading through language test data is
presented in Section 5.3.
Such research may not, of course, run as smoothly as we would
like. It may prove impossible to operationalise the posited four
types of reading separately in a test. It may be that reading is such
a massively parallel interactive process that we will not be able to
distinguish clearly between such components. It may be that, at
certain levels of ability - for example, weak and strong readers -
reading is indeed unitary; divisibility may be a function of the
level of student being tested. For readers linguistically proficient
in the target language and already competent readers in their L1,
reading in the target language may well be uni-componential,
whereas this may not be the case where either of these conditions
is not met (see Downing and Leong, 1982).
However, Johnston (1984) rightly emphasises the need to view
validity as the interpretation to be made from test results rather
than residing in the test itself. We must not lose sight of the
emerging evidence that there is doubt about the status of items
that focus on specifically linguistic operations at level D as part of
the assessment of a candidate’s general reading ability. As a matter
of urgency, it is necessary to investigate whether testing D-type
reading does in fact give us sufficient information about a
candidate’s ability to handle global comprehension tasks A and activities
C. We must address the implications of evidence that there may
be groups of candidates who are capable of type A and C reading
but who are severely challenged by type D test items. There must
be serious concern that test items which focus on the specifically
linguistic/individual word level may not be good predictors of
general reading ability, i.e. they do not give us an accurate picture
of the reading ability of all the individuals who sit a test.
We believe that utilisation-focused tests of reading need to be
based on a clear specification of the target situation needs of
candidates and an attempt will have to be made to identify the
skills/strategies which are needed to carry out their future activities.
A representative sample of those reading types should be
incorporated into the test in a number of different sections with their
own specific configurations of performance conditions. Kintsch
and Yarborough (1982: 834) argue in a similar vein:
It is clearly false to assume that comprehension is an ability that
can be measured once and for all, if only we had the right test.
Instead, ‘comprehension’ is a common sense term for a whole
bundle of psychological processes, each of which must be evaluated
separately. Only a collection of different tests, each tuned to some
specific aspect of the total process, will provide adequate results.
Crucially, a profile of appropriate abilities would indicate whether
or not the candidate is likely to be able to function effectively in
the target language situation in respect of each of the identified
skills/strategies. Everything we have said in our earlier review of
models in Chapter 2 and in the empirical review of test-based
research in Chapter 3 supports the case for profiling. Spolsky
(1994) succinctly adumbrates the complex and multidimensional
nature of comprehension and stresses the need for full description
in reporting results as against a single grade or score. He argues
(1994: 151):
... we will need to design and use a variety of reading assessment
procedures (not only tests) to allow us to report on a variety of
aspects of the student’s ability to understand, and to establish some
systematic way of reporting the results on all of them. The
differences the student shows across this range of results will inform us at
least as much as will the result of adding them together. However
good our tests are, a single score will always mislead.
Given the distinct possibility that different skills and strategies
can be taught and tested, and an acceptance that it is worthwhile
investigating these, then some form of profiling of these abilities
is essential rather than collapsing scores into a single score or grade
for reporting purposes.
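The contrast between a profile and a single score can be shown with a small sketch. The section labels, the marks and the threshold of 0.6 are all hypothetical and are used only to show the shape of the report; they do not reproduce the levels in Table 3.1.

```python
def report(section_scores, threshold=0.6):
    """Build a profile report from per-section (marks gained, marks available).

    Returns the per-section proportions, the sections falling below the
    (assumed) threshold, and the single collapsed proportion for comparison.
    """
    profile = {name: got / out_of for name, (got, out_of) in section_scores.items()}
    total_got = sum(got for got, _ in section_scores.values())
    total_out_of = sum(out_of for _, out_of in section_scores.values())
    return {
        "profile": profile,
        "below_threshold": sorted(n for n, p in profile.items() if p < threshold),
        "single_score": total_got / total_out_of,
    }


# Two hypothetical candidates with the same overall mark.
candidate_a = report({"expeditious": (18, 20), "careful": (6, 20)})
candidate_b = report({"expeditious": (12, 20), "careful": (12, 20)})
```

Both candidates receive the same single score, yet only the profile shows that the first is weak in one type of reading. That difference is exactly the information lost when scores are collapsed for reporting.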
Lastly, we have argued that, as well as carefully considering
operations and the conditions under which they are performed,
the test developer must pay due attention to selecting appropriate
formats for assessing performance. Kobayashi (1995) has clearly
demonstrated that the formats used for testing reading
comprehension may well influence performance. Such method effect
should be limited, as far as possible, by the inclusion of a range of
task types which replicate real-life performance (Johnston, 1983)
and which have been shown to be suitable instruments (valid,
reliable and utilisation focused) for measuring posited reading
skills and strategies.
Notes
1. We must be careful not to fall into the trap of thinking that these
two types of variation are separable. Clearly the dimensions intersect:
readers’ careful reading may be expected to produce a different product
from their expeditious reading, but their careful reading may also
produce different interpretations from other readers’ careful reading.
2. It will be recalled that Hoover and Tunmer adopt this course when
describing the components of reading ability.
3. Testers’ assumptions can so easily be thwarted. One of the authors
once set a JMB test apparently requiring a detailed background
knowledge of horse-shoes to a group of British adults. One student performed
brilliantly, only to confess that in his youth he had passed a lot of time
in a farrier’s workshop.
4. As pointed out in Chapter 2, Kintsch and van Dijk’s description is best
suited to careful reading.
5. The provision of such a categorisation should not be taken as
pre-empting the question as to whether multicomponent categorisations
are valid.
4
Similarities
Both testing and teaching involve students being given a written
text or texts, and being required to read it. Usually, they are also
expected to respond overtly to some task requirement (though in
classroom ‘silent reading’, this requirement may be dropped).
Apart from such an obvious overlap, the following similarities can
be discerned.
We noted in Chapter 3 that a feature of the British approach to
communicative testing is that the tasks and the conditions under
which they are performed should approximate to real-life
performance as closely as possible (McNamara, 1996; Weir, 1993).
We feel that the same should apply to the teaching of reading, at
least as far as comprehension activities are concerned.
Consideration of the types of task discussed for testing reading
in Chapter 3 and the performance conditions under which they
are performed, such as
■purpose
■nature of the texts
■length of text
■rhetorical structure
■topic area
■background knowledge
■writer/reader relationship
■speed of processing
■range of vocabulary
■grammatical complexity
are of equal relevance in the reading classroom in determining
activities and selecting texts.
For example, as far as purpose and texts are concerned,
reading genuine texts for authentic purposes is held to be crucial in
motivating learners to read (Nuttall, 1996). In both teaching and
testing, coping with genuine text is likely to be an important
objective at a certain stage for many learners. A key objective for
both the materials writer and the test developer should be to give
the students a realistic purpose for every reading activity (see Moran
and Williams, 1993: 68). Personal interest may be difficult to cater
for in a course book or a test, but instrumental purposes are
relatively easy to simulate in information-giving texts (see Paran,
1991 and 1993, for examples of this). Such purposes enable the
activity to move beyond the ubiquitous post-text comprehension
question(s) demanding a ritual show of understanding (ibid.).
A feature of modern textbooks on reading is to try to provide a
clear purpose for reading a text and Paran (1993) provides some
nice examples of how this might be set up through pre-reading
activities such as: an initial questionnaire followed by reading to
compare findings; a quiz and a reading to check how much you
know; prediction of content from title, words, illustrations, etc.,
and checking text to see how right you are; discussing own
opinions and comparing with opinions in the text.
The question of text length is another aspect shared by testing
and teaching. It is clear that people learn to read by reading not
just by doing exercises. Learners must therefore read enough in a
programme for it to make a difference (see Mahon, 1986). So just
as longer texts are necessary in tests where the interest is in
expeditious reading, so too in learning to read. The tendency to
employ short texts in tests and course books meant to cover the
range of reading skills and strategies is questionable (for example,
note the limited length of texts in McGovern et al., 1994).
Obviously, in a teaching situation the length of texts that it may be
possible to use is far greater, especially if out-of-class work can be
given. At this point differences between teaching and testing
begin to emerge.
Activities and performance conditions discussed in Chapter 3
are criterial for both teaching and testing; it is in the use made of
the data generated by the reader interacting with these that
differences emerge.
Differences
It is universally accepted that, in testing, reliability of measurement
is of crucial importance. Any factor that reduces reliability must
be isolated and, if possible, eliminated or at least minimised. In
the teaching situation, however, reliability of measurement is far
less important. This is the major difference between testing and
teaching, from which all the other differences, set out below, derive.
Interpretations
We have argued that traditional testing aimed at forming a
reliable estimate of an individual reader’s performance. In order to
achieve this, a consensus as to what constituted a standard
performance in a particular context had to be reached in
advance by the testers. Alternatives to this standard had to be
eliminated. In other words, the test had to be constructed in such
a way that for each task there was a right answer. Scenarios, even
if reflecting situations in the real world, where either there was no
agreed right answer (as in reading for enjoyment) or where
different answers were arguably equally valid, had to be avoided.
In Chapter 2 we discussed different styles of reading and
contrasted the submissive versus the dominant reader. We argued in
Chapter 3 that in testing there was little room for dominant
reading or challenging the texts as this would certainly defeat any
attempts to assess such activities reliably. In addition, no account
could be taken of the way in which the response offered had been
arrived at. In other words, testing at present is concerned with
product, and not process. Teaching, on the other hand, since
reliability of measurement is not as important, can take differing
products, interpretations, into account. Since it must be
concerned with how one goes about solving a reading task, it is also
permitted to encompass process.
Thus, reader-specific responses to text are possible and to be
encouraged in the teaching situation, whereas for reasons of test
reliability they were excluded from consideration in testing (see
Section 3.1). In learning to read, activation of relevant schemata
is seen as a key part of the reading process. Pragmatic inferencing
(Chikalanga, 1992) is also viewed as an important aspect of the
reading classroom, particularly at more advanced levels (see
Section 4.2 below for a discussion of this type of reading, and Nuttall
(1996: 121 and 167) for discussion of classroom procedures and
exemplification). Inferencing may indeed promote effective
learning (Pearson and Fielding, 1991).
Training
Recent textbooks (Paran, 1991, 1993) build in cognitive and
metacognitive training as a central part of each unit. Their exercises
go beyond mere answering of comprehension questions and
attempt to teach strategies for coping with texts at the pre-reading,
while-reading and post-reading stages. Sometimes learning is paid
lip service in testing tasks, but this is not common. It is this
developmental rather than accountability function which distinguishes
summative testing from teaching.
Formative testing in the classroom is different, however, in that
the purpose, the use to which the results are put, should be
diagnostic. Teaching cannot proceed without reliable information
on what students can or cannot do. Formative testing has a crucial
role in providing such data for developmental purposes and, as
such, the distinction between teaching and testing tasks is blurred
in such formative use of tests.
Tasks
We have already said that testing is fundamentally concerned with
tasks. The testees must perform overtly, in response to a task set,
and their performance is assessed. Tasks, of course, occur in the
teaching classroom too, but there are differences. While there are
numerous definitions of tasks in teaching (Nunan, 1989, 1993;
Candlin and Murphy, 1987; Crookes and Gass, 1993a and b; Skehan,
1996; Skehan and Foster, 1995) we are happy to take Williams and
Burden’s (1997: 167) simple description as our working definition
for teaching purposes:
Basically, a task is anything that learners are given to do (or choose to do)
in the language classroom to further the process of language learning.
We would want to extend tasks to include such activities outside
the classroom and limit it for our purposes to reading, e.g.
reading a book at home for pleasure. We would also see formative test
tasks as falling under this umbrella. We also accept that Nunan’s
elements of a task - input data, activities, goals, role of learners
and role of teachers - are all important and all interact with each
other (Nunan, 1993). Consideration of one will necessarily involve
some consideration of the others. For ease of description the role
of the learner and goals are considered in Section 4.2; input data,
i.e. texts and activities, are discussed in Section 4.3; and we examine
the role of the teacher in Section 4.4.
Staging tasks
Unlike the test, in the reading classroom tasks may well be broken
down, staged or scaffolded to help the less able reader. The teacher
provides help to enable students to complete tasks they would not
be able to do on their own. In contrast to the driving test itself, in
learning to drive one would not expect to do everything in a
single lesson, at least not in the early stages. So too with reading;
at some stage it might be necessary to focus on certain strategies
or skills, or analysis of sentence functions or text structures (see
Nuttall, 1996: 100-24).
Microskills
There may well be a necessity to bring some students up to a
threshold level of linguistic ability whereby they are enabled to
establish, expeditiously or carefully, the macrostructure of a text.
Activities promoting global comprehension may not be sufficient
for this. Learning the important skills of word recognition and
decoding may involve less direct, less global activities. Whereas in
testing we have suggested that successful expeditious and careful
reading for global comprehension must indicate a minimally
adequate knowledge of lexis and structure, and thus make testing
at the word level unnecessary except for diagnostic/placement
purposes, the teaching situation is different. We may have to
provide the opportunities to practise activities promoting word
recognition and decoding skills in reading classes although, as such,
they may never appear in proficiency or achievement tests of
comprehension.
Co-operation
We noted above that in summative testing the teacher and fellow
students are removed from the interaction, and the help that can
be provided by both in learning to read is not available to the
student. The interest in the test situation is in what the student is
capable of comprehending unaided. In testing, some students
might be expected to fail; whereas in teaching, the agenda is to
try to ensure that nobody does.
In the test situation the reader is isolated from contaminating
sources, such as help from other students or from the teacher, in
an attempt to measure his or her ability in a construct unmuddied
by other influences. In the reading classroom pedagogical input
exists in terms of instruction and mediation that is absent from
the test context: i.e. advice on strategies and skills, practice in
their use and discussion of their value. The aim here is to bring
about understanding rather than just measuring it in a statistically
reliable fashion.
The agenda of the classroom is more formative, co-operative
and developmental. Thus the methodology, the activity of the
reading classroom, can be wider and richer. The tasks available
for learning to read are more diverse and may involve working
with others, students or teachers, in both pairs and groups. The
learners are not being asked to demonstrate how well they can
use strategies or skills but rather to develop and improve their use
of these. The tasks employed may be similar in the reading lesson
to those of the test but differ in the way they are used.
In the classroom, comprehension questions set on texts are
often done in groups, which promotes effective discussion
concerning how the answer was arrived at, together with feedback for
individuals in a non-summative manner. By verbalising about their
own reading, students come to understand better the processes involved.
Nuttall extols the value of buzz groups (1996: 201): small groups
work on the task for a short period, and report back in plenary
followed by whole group discussion.
Conclusion
While there are a number of similarities between teaching and
testing, there are also marked differences which necessitate
considering teaching separately.
To teach or not
A car sticker in Britain carries the message: 'If you can read this,
thank a teacher.' The implication is clear: if you are not taught to
read, you will not learn. It is, however, perfectly likely that some
L1 children, at least, learn to read with little or no formal teaching,
and it is certainly the case that L2 readers learn to read the
second language without formal instruction. Whether explicit
instruction is any more effective than simply encouraging students
to read and form their own rules remains unproven. We would,
however, agree with Pearson and Fielding (1991), in their excellent
review of comprehension instruction in L1, that the danger
of the non-interventionist approach is that the good readers get
better and the poor do not. The gap widens.
In some classes students will not lack ability in the skills and
strategies discussed below, and until problems become evident in
these areas they may well be avoided. The time to practise these
skills and strategies is in response to needs or lacks that become
evident.
What is also clear is that comprehension teaching effectiveness
may differ from context to context and there are no generic classrooms
(Bernhardt 1991a: 173); those of young bilinguals will be
very different from those of adult second language readers preparing for
postgraduate study in an overseas context. Because of the great
diversity of reading contexts we shall not be providing multiple
examples of reading activities and texts; there are stimulating collections
of these already available to which we would direct the
reader (Grellet, 1981; Nuttall, 1996; Wallace, 1988; Williams, 1984)
and a plethora of reading textbooks aimed at specific audiences.
We will instead focus on the evidence from principled reading
research and instruction to formulate generic suggestions for teaching.
We will attempt below to determine what research indicates
as being the most productive activities across diverse situations.
The discussion below is therefore for the most part in terms of
principles rather than commentary on specific examples of classroom
tasks. Reference will, however, be made to sources the reader
can consult for practical exemplification.
Pre-reading activities
Previewing
Previewing can be used to make a decision whether to read a
book, an article or a text. Where appropriate to text type it might
involve:
■thinking about the title
■checking the edition and date of publication
■reading the table of contents quickly
■reading appendices quickly
■reading indices quickly
■reading the abstract carefully
■reading the preface, the foreword and the blurb carefully.
Hamp-Lyons (1984: 305) adds that previewing helps students recognise
the difficulty level of a text and comparative difficulty
with other texts in the same field, helps them judge the
relevance/irrelevance of a text for a particular topic, and helps them decide
which book from a set of possibilities would be more appropriate
to read for a specific purpose. Its value for teaching is the amount
of time it might save if it prevents prolonged reading of something
of no value (see Nuttall, 1996: 45-8).
It is of particular use in deciding whether textbooks or parts of
a textbook are of value, though browsing through a novel at the
airport bookshop before deciding to purchase is another manifestation.
The reason that it seldom features in tests is one of
efficiency and reliability. It is difficult in the exam situation to
provide the same textbook(s) for large numbers of candidates.
Additionally, the number of items that can be usefully written is
often limited, with implications for test reliability. Similarly, how
would one evaluate X's decision to purchase a particular book?
However, in the classroom context, previewing may be very
useful, particularly for English for Academic Purposes students.
Previewing has obvious links with expeditious reading strategies,
particularly skimming for gist, discussed in Section 4.3 on cognitive
strategies.
For exemplification of previewing, see Grellet (1981: 58-61),
Hamp-Lyons (1984) and Trzeciak and Mackay (1994: 5-10).
Prediction
After taking the decision to read a text, this strategy is used to
anticipate the content of a text; to make hypotheses about the
macropropositions it might contain. It is a form of psychological
sensitising, thinking about the subject and asking oneself related
questions.
In theoretical terms it accords with the hypothesis advanced
earlier in Section 3.3 that establishing a macrostructure for a text
is an aid to more detailed comprehension. One might also hypothesise
that the activation of relevant schemata should facilitate
the reader’s interaction with a text (see Section 2.2). Finally, this
activity has the potential to clarify for the reader what the purposes
for reading the particular text might be.
It is often a case of supplying or activating appropriate background
knowledge, and this might best be done through pre-
reading activities: lectures, discussion, debate, real-life experiences,
text previewing or introduction of vocabulary. It makes use of
top-down processing to activate different kinds of schemata in
common with many pre-reading activities.
Haines (1988) uses surveys and questionnaires to encourage
discussion and activate and build up background knowledge pre-
reading, and Paran (1993) uses surveys to similar effect. Tomlinson
and Ellis (1988) offer a range of pre-reading activities aimed at
activating formal knowledge of text.
Williams and Moran (1993: 66) suggest:
Perhaps the most effective of these activities are those which elicit
factual information or a personal response and ask the students to
pool such information in pair or group work. Preferably, this is
followed by a task which relates the discussion to the first reading
of the passage.
In comparing teaching and testing tasks currently in use it is noticeable
that, whereas prediction activities are now a common feature
in textbooks on the teaching of reading, they seldom feature in
tests. Part of the reason for this is presumably that such data do
not lend themselves easily to assessment. The open-ended and
idiosyncratic nature of such prediction, based as it is on already
existing constructs, is obvious.
Williams (1984: 36-51) provides useful advice and examples of
pre-reading activities that can be employed in the language classroom.
Swaffar (1981) and Carrell and Eisterhold (1988) describe key
word and key concept pre-reading activities. See also Glendinning
and Holmstrom (1992: 20-4), Langer (1981) and McGovern et al.
(1994: 11-12) for further exemplification.
While-reading strategies
Self-questioning
This is identified by research as a characteristic of good reading
when it promotes cognitive processes such as inferencing, monitoring
understanding and attending to structure. Alvermann and
Moore (1991: 961) detail how ‘generally, instruction in self-questioning
improves student processing of text’ and note that ‘poorer
readers tend to benefit most from such training. Scaffolding of
instruction leading to gradual control appears to be beneficial.’
Nuttall (1996: 37) describes this activity as interrogating texts; text
talk. For students unfamiliar with this activity the teacher interrogating
the text aloud can provide a valuable example, particularly
where the focus is put on important problematic aspects of a text.
Organised methods involving self-questioning have been in use
for some time (see Nuttall, 1996: 129; and Richards, 1989 for details
of SQ3R). Palincsar and Brown (1984) focused on teaching
summarising, questioning, clarifying and predicting skills, arguing
that these activities, if engaged in while reading, enhanced comprehension
and, at the same time, gave the student the opportunity
to monitor whether comprehension was succeeding.
Self-monitoring
Monitoring one’s own comprehension - checking that comprehension
is taking place and adopting repair strategies when it
isn’t - is seen as a hallmark of skilled reading. It is important that
students are aware of how various strategies will help them. Self-verbalisation
was also seen as important (Pearson and Fielding,
1991: 838).
The connection with schema theory is clear: by asking themselves
whether they understand, learners are asking whether it fits
in with what they know already. Thus they learn how to understand
what they read in the process of learning how to monitor
their comprehension (Pearson and Fielding, 1991: 847).
Alvermann and Moore (1991: 962 et seq.) sound a note of
caution in that many of these studies were conducted under experimental
rather than field conditions with consequent threats to
their ecological validity: they were decontextualised; an experimenter
rather than the normal classroom teacher introduced the
intervention; texts were specially prepared and were often shorter
than normally met texts; and students were often not prepared
in the use of the strategy before the intervention. They argue
(p. 974) that we need to develop a research methodology which
would actively involve teachers and carefully document baseline
data of the situation preceding the decision to implement an
innovation and also collect data to monitor its effect.
Post-reading strategies
Automaticity
We pointed out in Chapter 2 that a strong criticism of so-called
top-down models was that they attributed too much importance to
hypothesising, or guessing, whether of lexical items or larger units.
Stanovich (1980) points to the implausibility of hypothesis testing
being of much value to the skilled reader as it takes so much less
time to recognise a word than to go through a complex guessing
game. In one of the most important contributions by cognitive
psychologists to reading research (and potentially the teaching of
reading) in recent years, it was repeatedly found that good readers
used context much less often than poor readers when recognising
printed words. In fact they appear to be able to recognise words
without any conscious thought, i.e. at the automatic level.
Juel (1991: 771) cites important evidence that early attainment
of decoding skill/word recognition is a very accurate predictor of
later reading comprehension in L1 children. Those who do poorly
in the first year of learning to read are unlikely to improve their
position as compared to those who do well. Poor decoding skill
may delimit what the child can read and the differences are further
compounded by out of school experiences.
Automaticity in L2
The increased importance attributed to automatic word recognition
in L1 reading has been extended, though with less empirical
support, to the L2 reading area. Previously in L2, a great deal of
faith had been placed on decoding by means of context. Haynes
(1984), however, points out that we need to get the level of
automaticised vocabulary up rather than focusing on decoding in
context. Haynes points out (48):
Rapid precise recognition of letters and words, that is, bottom up,
more input constrained processing, must be mastered before fluent
reading can take place.
She cites evidence from L1 studies that fluent reading is achieved
by increasing one’s bottom-up processing of print and decreasing
semantic and syntactic guesswork, though this is not as yet proven
for L2. She questions the emphasis given in textbooks to guessing
the meaning of unknown words from surrounding context as
the main approach to learning vocabulary. Context often proved
inadequate to support accurate inferencing and encouraging the
guessing from context strategy might well lead to frustration in
these cases.
There is some negative evidence for this position in the L2
area. Bensoussan and Laufer (1984) found no evidence that better
readers are able to use context more effectively for lexical
guessing than less proficient students. More crucially, they argue
that in many cases only a minority of word meanings can be recovered
from the context. Working out the meaning of words in
context is only a part of the vocabulary skills needed for fluent
reading and it appears that it may actually interfere if a student
over-relies on this strategy.
Beck (1981) argues that ‘basic recognition exercises to improve
speed and accuracy of perception may constitute an important
component of an effective second language reading programme’.
If, as appears likely, automatic word recognition is more important
to fluent processing of text than context clues, the large-scale
development of recognition vocabulary may be crucial to reading
development (van Dijk and Kintsch, 1988; Perfetti, 1985). Poor
readers have simply not acquired automatic decoding skills. They
spend too much processing time thinking about words
and relating them to the surrounding context, rather than automatically
recognising them.
Bernhardt (1991a: 235-6) argues that the ultimate goal is automaticity.
Good L1 readers process language in the form of written
text without thinking consciously about it, and good L2 readers
must also learn to do so. It is only this kind of automatic processing
which allows the good reader to think instead about the larger
meaning of the discourse - on the one hand, to recover the
message that the author intended to convey and, on the other, to
relate that new information to what the reader knows and feels
about the subject, and to his or her reasons for reading about it.
In short, it is only this kind of local processing that allows for
global meaning with true comprehension.
L2 reading speed
Bernhardt (1991a: 234) argues that a major bottom-up skill for readers is
reading as fast in the language as their knowledge of it will allow,
in relation to their reading purposes. Where appropriate we need
to dedicate some time to rapid identification of lexical and grammatical
form. Juel (1991: 771) quotes Chall (1979): ‘. . . (learners)
have to know enough about the print in order to leave the print.’
It is noted in the literature that L2 readers often read texts
more slowly than L1 readers. One of the most striking differences
between L1 and L2 readers of English texts is their speed. Haynes
(1984: 50) and others have identified the root of the problem as
the length of the fixation slowing down the reading rather than
the number of fixations or regressions. Haynes (1984: 50) notes:
There is no clear experimental evidence explaining these longer
visual fixation times, but a strong possibility involves the time required
for lexical access, that is the time it takes for a reader to
match the printed word to a word meaning in memory.
It seems likely that it takes longer to access lexical meanings,
remember what a word means, in L2 than it does in L1. L2 readers
of English do not have large well-practised vocabularies and years
of experience of recognising words in print. Hence, it takes them
longer to decide whether a word is known or unknown and, in
the latter case, whether to skip it or not.
Text selection
Williams (1984: 15) discusses the shortcomings of the types of text
used solely for learning language and sees, as its key failing:
There is little attention to reading as a skill in its own right that
might need to be developed in different ways for different purposes.
He concludes (p. 125):
. . . although it is very tempting to use written text as a basis for
the learning and teaching of language, an approach that goes no
further not only neglects reading as a skill but also neglects the
ultimate purpose of learning a language which must surely be to
use that language. Being able to read skilfully and flexibly is an
important use of language.
Nuttall (1996: 30) argues in a similar vein for focusing on using
texts to convey meaning rather than as a convenient vehicle for
conveying language:
partly because this is often neglected in the language classroom,
partly because treating texts as if they meant something is more
effective in motivating students and promoting learning.
She offers a number of criteria for text selection including:
■suitability of content: it is essential that text should interest the
reader;
■exploitability: facilitation of learning. How well can it help develop
reading ability; this is not a language lesson or a content
lesson but rather ‘how language is used in conveying content
for a purpose’ (p. 172).
Appropriate texts, for example in terms of:
■intended audience
■intended purpose
■source
■length
■lexical range
■rhetorical structure
■topic familiarity
■relationship to background knowledge
■channel of presentation
should be chosen to enable students to practise careful reading.
The same selection criteria apply to the other strategies and skills
we discuss below. It seems that careful reading can accommodate
implicit text structure and ideas, whereas expeditious reading is
more dependent on explicitness in text structure or ideas. This is
what Hamp-Lyons (1984) called a ‘text strategic approach’. The
focus here is on exploiting generalisable features of text ‘in order
to help learners develop skills for approaching any text’.
This is an area in critical need of attention by reading teachers.
What are the salient features of text selection which will facilitate
selecting texts to best practise appropriate activities? What is a
principled set of procedures to determine whether texts appropriate
in terms of the above conditions actually allow the practising
of intended activities, purposes for reading? An attempt to draw
up a specification for text selection for an advanced reading test
in China is presented in Appendix 1. The same categories of
description are applicable to text selection for teaching purposes.
In addition, in Section 5.3 a text-mapping procedure is described
which offers a systematic method for the development of tasks
once a text appropriate for the intended purposes of reading has
been identified.
It is also important that students are exposed to the range
of materials they might later have to cope with for either informational
or entertainment purposes. For example, it is no
use basing EAP reading materials solely on texts taken from
newspapers, though obviously the introduction of target texts
will only occur when it is appropriate to do so both in terms of
background knowledge and linguistic readiness. Hamp-Lyons
(1984: 308) cautions:
. . . our readings in schema theory. . . convinced us of the need to
choose texts which the students would be easily able to integrate
with their own prior experience and knowledge of the world.
Authenticity was also discussed in Chapter 3 with regard to testing.
Nuttall (1996: 177) argues:
To pursue the crucial text attack skill we need texts which exhibit
the characteristics of true discourse: having something to say, being
coherent and clearly organised. Composed (i.e. specially written)
or simplified texts do not always have these qualities.
This is not to say that texts may not be modified with due caution
(see Lewkowicz, 1996). For example, difficult words can be substituted
or complex syntax unravelled. Williams and Moran (1993:
66) note that the claims for authenticity are not taken as literally
as they once were and simplified or specially written texts have a
place in the reading course books they reviewed. Lewkowicz (1997)
makes the point that as long as salient performance conditions,
e.g. appropriate rhetorical structure, are present, full authenticity
may not be essential in the texts employed for teaching or testing
specified skills and strategies.
Williams (1984: 18-19) makes a number of points about the
linguistic difficulty of the text selected:
... it should not contain a large amount of language that is too
difficult for most of the class ... if too difficult, then either the
pace of the lesson will be slow, and boredom will set in, or the pace
will be too fast, and the learner will not understand enough, and
frustration will result.
Nuttall (1996: 174-6) deals with this under the heading of readability
(see also discussion of text difficulty in Section 3.3). She
sees it as a combination of structural and lexical difficulty though
recognising the influence of conceptual difficulty and interest.
Texts selected should take the level of the students in terms of
vocabulary and structure into account. In multi-level classes, self-
access work at different levels may be an essential supplement if
the provision of differentiated reading materials is not available
for regular classroom instruction (p. 174).
Nuttall (1996: 36) talks of the ‘next step’ level, i.e. one step
further than where the student currently is, but no more, as the
target for pushing them on. The teacher provides ‘scaffolding’ to
help them take this extra step. Nuttall describes this as never
doing anything for them that they can do themselves with a little
support. This is discussed by Williams and Burden (1997: 65-6) as
the zone of proximal development from the field of educational
psychology:
... it suggests that the teacher should set tasks that are at a level
just beyond that at which the learners are currently capable of
functioning, and teach principles that will enable them to make
the next step unassisted. Bruner and others have used the term
‘laddering’ to refer to this process.
Williams (1984: 34) also advocates using a range of materials,
selecting texts ‘that deal with the same topic or theme, since this
will result in consolidation and extension of language and language
use in a way that is comprehensible to the general learner’.
Vygotsky (1962: 78) is one of the earliest writers to deal with
mediation in the sense of using tools to achieve goals, and his
work previews much of the current discussion in this area.
If learners can choose their own texts, this is likely to be highly
motivating; but in those cases where textbooks are prescribed, this
may not be possible, and how the teacher uses texts becomes
crucial. Walker (1987) offers a proforma set of activities for students
who bring their own texts to the classroom so that even
though the instruction is individualised in terms of text the activities
being practised are common.
In addition to work focusing on the careful reading activities
detailed at the start of this section, there are a number of other
interventions which should help to ground these skills properly.
Text organisation
Students with varied profiles appear to benefit when teachers help
them activate or build formal knowledge of text structure: structural
relations between main ideas in a text. Pearson and Fielding
(1991) provide examples of story structure and expository text
structure instruction. They describe ways of activating knowledge
of the structure of the text itself, e.g. of a story grammar. This
might involve consideration of its abstract hierarchical structure:
setting, problem, goal, action, outcome; and giving practice in
identifying category relevant information.
Comprehension, particularly inferential comprehension, is also
helped when connections are made between readers’ background
knowledge and experience and the content of the text under
review (Pearson and Fielding, 1991: 847). This may happen prior
to reading. Invoking knowledge structures aids comprehension.
Making predictions before reading and confirming them during
reading, and asking inference questions during and after reading,
improves comprehension - particularly inferential comprehension
(ibid.).
As well as formal knowledge of text structure, knowledge about
specific topics and themes related to a story is important. The
role of pre-reading discussions to generate expectations in this
respect has been shown to be effective (Pearson and Fielding,
1991: 822). Other methods include using writing to anticipate
story information and developing a short list of key words, and
such a cognitive engagement has been found to help poor readers
(p. 823).
Longer texts, or a number of texts on the same theme, are seen
by Williams and Moran (1993) as another way in which authors
have tried to build up background knowledge in a certain area
(see Haines, 1987, and Tomlinson and Ellis, 1988, for examples
of these).
Additionally, inferential questions and prediction questions - a
focus on important ideas (central events in a story), on constructing
an interpretation and on summary - are seen by these reviewers
as useful techniques for improving the understanding of a story.
As far as expository text structure is concerned, it has been
suggested that visual summary is a useful tool. What was said earlier
in Sections 2.2 and 3.3 about the organisational structure of a
text is relevant here. The importance of summary (see below),
schematic representation of a text, and rating the importance of
ideas related to the text to comprehension, learning and remembering,
are noted by Pearson and Fielding (1991: 827) as they
promote attention to text structure. The effectiveness of teaching
students to use text structure to identify main ideas is confirmed
by Alvermann and Moore (1991: 960), though they point out that
students’ familiarity with the topic appears to mediate instructional
effectiveness.
It is clear that readers who are knowledgeable about, and who
can follow the author’s text structure, recall more of a text than
those who lack these attributes (Pearson and Fielding, 1991: 827)
and they note that more good readers than poor follow the writer’s
structure in recall of texts. Hierarchical summaries using discourse
clues and visual representations (networking, flowcharting, conceptual
frames) are also seen as useful in helping recall text information
better and improving comprehension, particularly for lower
ability students who need more help in developing strategies.
Nuttall (1996: ch. 12) offers a variety of information transfer task
examples that might be used in the reading class. Also the section
on information transfer in Section 3.4 on testing offers advice on
the use of this task type.
Pearson and Fielding (1991: 832) conclude:
It appears that any sort of systematic attention to clues that reveal
how authors attempt to relate ideas to one another or any sort of
systematic attempt to impose structure upon a text, especially in
some sort of visual representation of the relationships among key
ideas, facilitates comprehension as well as both short term and
long term memory for the text.
It appears that while most strategies are of value across the ability
range:
. . . the more able readers benefit the most . . . (but) regardless of
ability level, the teaching strategies have their greatest effect when
students are actively involved in manipulating conceptual relationships
and integrating new information with old knowledge.
(Alvermann and Moore, 1991: 960)
Nuttall (1996: 100-24) provides useful advice and sound practical
exemplification of a range of text attack skills: recognising functional
value of sentences; recognising text organisation; recognising
presuppositions underlying a text; recognising implications;
and making inferences. A particularly interesting example is where
students are given parts of a chapter or of a text and they have to
put the parts in the right order. This is best done in groups. It
involves an integration of many of the skills and strategies discussed
in this chapter.
Creating text diagrams to illustrate the way ideas and information
are presented in a text is probably best done by the students
working in groups with classroom discussion later. Not all texts
lend themselves to this technique, so input texts need to be chosen
carefully. The section on mindmapping in Section 5.3 offers
some insights into how students might go about this process and
learn something more about skills and strategies at the same time.
Nuttall notes of text diagrams that
Their great advantage which outweighs the disadvantages ... is that
they demand close study of the way the text is put together and
promote text focused discussion. They are useful either to display
common patterns of paragraph organisation or to elucidate the
structure of complex text. (p. 109)
Useful discussion of networking can be found in Danserau et al.
(1979); flowcharting in Geva (1980, 1983) and for work on top-level
rhetorical structures see Meyer (1975) and Bartlett (1978),
who show how, through diagram, ideas and their relationships are
represented within the text. Williams (1984) provides a useful
basic survey of text structure and some ways of introducing it to
students.
Careful reading into writing: a product from
the reading process
Summarisation is perhaps the verbal equivalent of the visual diagrammatic
representation of text structure discussed in the previous
section, which could easily be subsumed under the broad
umbrella of summary. In contrast to earlier work on summarisation,
where research results were confounded by the use of low-level
multiple-choice items as criterion scores, Pearson and Fielding cite
positive support for summarising including improved comprehension
on the texts involved, increased recall and even improvement
on standardised reading test scores by students involved in this
activity (1991: 833). There is also evidence that summarisation
training transfers to new texts. They argue for its value as a broad-
based comprehension training strategy.
Students understand and remember ideas better when they have to
transform those ideas from one form to another. Apparently it is
in this transformation process that author’s ideas become reader’s
ideas, rendering them more memorable. (p. 847)
In Chapter 3 we discussed the aim in testing of measuring reading
unmuddied by the contaminating influence of other variables,
e.g. writing. Measurement considerations such as this do not loom
as large in the teaching situation. A good case can be made for
the fruitful interaction between reading and writing in the language
classroom, both activities being seen as potentially complementary
to each other. Zamel (1992) argues that reading and
writing instruction benefit each other in an integrated approach
and argues for ‘writing one’s way into reading’. Silberstein (1994:
70-1) argues that by integrating instruction students come to understand
the way in which both readers and writers compose text.
Smith (1988: 277) comments that ‘writing is one way of promoting
engagement with a text which leads to better comprehension’.
The student has to establish the main ideas in a text, extract
them and reduce them to note form and then rewrite the notes in a
coherent manner in their own words. Brown and Day (1983) identified
a number of rules for summarising which match the rules
of Kintsch and van Dijk for establishing macropropositions (see
Chapter 2, p. 80):
■delete trivial information
■delete redundant information
■provide a superordinate term for members of a category
■find and use any main ideas you can
■create your own main ideas when missing from the text.
Pearson and Fielding (1991: 834-5) report that such training
enhanced summarisation and increased scores on reading tests
when compared with control groups. Exemplification of summary
tasks can be found in Grellet (1981: 233-6), Paran (1991: 5, 27,
41) and Trzeciak and Mackay (1994: 26-8, 33-55).
We expressed concern about what actually happened as regards
reading in the L2 classroom at the start of this section. The evidence
suggests that little attention is devoted even to teaching the skills
and strategies necessary for successful careful reading for global
comprehension of text. The situation may be even worse as regards
expeditious reading strategies. Leaving aside the prevalence of short
texts in most course books, less attention is devoted to these
strategies in comparison with careful reading.
Skimming
This involves processing a text selectively to get the main idea(s)
and the discourse topic as efficiently as possible, which might
involve both expeditious and careful reading and both bottom-up
and top-down processing. The focus may be global or local and
the rate of reading is likely to be rapid, but with some care. The
text is processed quickly to locate important information which
then may be read more carefully. Purposes for using this strategy
might include:
■to establish a general sense of the text
■to quickly establish a macropropositional structure as an outline
summary
■to decide the relevance of texts to established needs.
Where appropriate to text type it might involve one or more of
the following operationalisations:
■identifying the source
■reading titles and subtitles
■reading the abstract carefully
■reading the introductory and concluding paragraphs carefully
■reading the first and last sentence of each paragraph carefully
■identifying discourse markers
■noting repeated key content words
■identifying markers of importance
■skipping clusters of detail
■glancing at any non-verbal information.
Readers would be taught to be flexible as not all strategies would
work with all texts. Also, some attention might usefully be paid to
metacognitive strategies discussed above such as prediction and
monitoring; the former to facilitate the use of existing knowledge,
the latter to help separate less important detail from main ideas.
Practical exemplification can be found in Grellet (1981: 71, 73-5,
81-2), Paran (1991: 79-81) and van Dijk (1977: 79).
Search reading
This differs from skimming in that the purpose is to locate
information on predetermined topic(s), for example, in selective
reading for writing purposes. It is often an essential strategy for
completing written assignments.
The process, like skimming, is rapid and selective and is likely
to involve careful reading once the relevant information has been
located. As in skimming, both bottom-up and top-down processing are
therefore involved. Unlike in skimming, sequencing is not always
observed in the processing of the text although it is likely to be
more linear than scanning. The periods of closer attention to the
text tend to be more frequent and longer than in scanning. It
normally goes well beyond the mere matching of words to be
found in scanning activities, and might include the following
operationalisations where appropriate:
■keeping alert for words in the same or related semantic field
(unlike scanning, the precise form of these words is not certain)
■using formal knowledge of text structure for locating information
■using titles and subtitles
■reading abstracts where appropriate
■glancing at words and phrases.
Examples of search reading activities can be found in Ellis and
Tomlinson (1988: 86-7), McGovern et al. (1994: 12), Morrow
(1980: 15, 17, 37, 39) and Paran (1991: 55).
Scanning
This involves looking quickly through a text to locate a specific
symbol or group of symbols, e.g. a particular word, phrase, name,
figure or date. The focus here is on local comprehension and
most of the text will be ignored. The rate of reading is rapid and
sequencing is not usually observed. It is surface level rather than
deep processing of text and is mainly reader-driven processing.
There is a rapid inspection of text with occasional closer
inspection. Pugh (1978: 53) describes it as:
finding a match between what is sought and what is given in a text,
very little information processed for long term retention or even
for immediate understanding.
The operationalisations involved might include looking for/matching:
■specific words/phrases
■figures/percentages
■dates of particular events
■specific items in an index/directory.
The Crescent Series, designed for use in the school systems in
the Middle East, offers a useful general procedure for scanning
(O’Neill et al., 1996; Teacher’s Book 8, xxii). For further
exemplification of scanning, see Grellet (1981: 83), Morrow (1980: 18)
and Nuttall (1996: 49-51). Nuttall (pp. 51-3) also provides some
interesting ideas and examples on how graphic conventions -
print size and style, layout, spacing, indentation - help the reader
navigate a text and sometimes can signal, e.g. through different
type faces, how the text is structured.
Extensive reading
The distinction we have been drawing between careful and
expeditious reading can easily be confused with another, earlier,
distinction between intensive and extensive reading. While there is
undoubtedly an overlap, there are significant differences between
the two dichotomies. The careful/expeditious distinction, taken
together with the distinction between local and global, results in a
number of different reading styles, or strategies, which can be
employed either alone or, more often, in conjunction to accomplish
a range of reading tasks. While expeditious reading is likely to be
directed at lengthy texts, there is no reason why careful reading
must be restricted to short texts. In fact, there are cogent reasons
in academic contexts as to why it should not be.
The intensive/extensive distinction, on the other hand, is largely
a pedagogical construct. Bright and McGregor (1970), for example,
see them as being distinguished in terms of the number of
questions the teacher decides to ask about a text:
For the sake of convenience we shall discuss and exemplify
extensive and intensive reading as though they were opposites. This will,
however, be misleading unless we think of them as lying at opposite
ends of a scale determined by question density. The point on the
scale at which we decide to work will depend on:
(i) how much there is in the passage waiting to be discovered.
Not all passages are worth meticulous attention.
(ii) how much time is available. By no means all the passages
worth serious attention can be tackled.
(iii) how much the class is capable of seeing and how well they
respond.
(iv) how much is essential to a minimum worth-while response.
(v) how hot the afternoon is - and so on.
(Bright and McGregor, 1970: 65)
Bright and McGregor (p. 80) remark that
... it is not whole lessons but parts of lessons that may properly be
so divided. In the middle of a chapter, we may stop to dwell on one
word. This is intensive study.
However, our experience in a wide range of countries suggests
that the distinction has become fossilised, with intensive reading
being confined to the classroom, where it involves the teacher
asking a large number of questions about a short text, while extensive
reading refers to either ‘silent reading’ in the classroom, or reading
done unsupervised in the library or at home, the aim being pleasure
or practice, or both.
Hafiz and Tudor (1989: 1-2) see the goal of this type of extensive
reading as ‘to “flood” learners with large quantities of L2 input
with few or possibly no specific tasks to perform on this material’.
Nuttall (1996: 127) describes it as ‘the private world of reading
for our own interest’ and offers some valuable suggestions for
organising such activities. She argues that reading extensively is
the easiest and most effective way to improve reading and it is
easier to teach in a climate where people enjoy the activity as well
as value it for pragmatic reasons.
Davis (1995: 329) defines an extensive reading programme
(ERP) as:
... a supplementary class library scheme, attached to an English
course, in which pupils are given the time, encouragement, and
materials to read pleasurably, at their own level, as many books as
they can, without the pressures of testing or marks. Thus, pupils
are only competing against themselves, and it is up to the teacher
to provide the motivation and monitoring to ensure that the
maximum number of books is being read in the time available.
Williams (1984: 10) sees extensive reading as the ‘relatively
rapid reading of long texts’ (see Hedge, 1985) and emphasises
that it should normally be at the level of the student’s reading or
below it. This contrasts with careful intensive reading where the
aim is often to stretch the student slightly.
Bamford (1984: 218) claims that ‘for all but advanced learners,
the best way to promote extensive reading is by graded readers’.
Graded readers
In terms of contributions to the teaching of reading in this
century, Howatt (1984: 245) singles out the work of West in Bengal in
the 1920s, who argued against the prevailing orthodoxy for the
greater surrender value of basic literacy as against training in
spoken language. West developed a system of readers using the
principles of lexical selection and lexical distribution, the latter giving
the reader more practice material between the introduction of
new words - a distinct problem with earlier primers, which
introduced too many new words too quickly.
Hill and Reid-Thomas (1988a: 44) describe a graded reader as
follows:
A graded reader may be either a simplified version of an original
work or a ‘simple original’, i.e. an original work written in simple
English. In either case it is written to a grading scheme which may
be set out in terms of vocabulary, sentence structure, and, in some
cases, content.
Nuttall makes the distinction between needing to read, which
can be instigated in the classroom, and wanting to read (1996:
130), which is a greater incentive to more people. She provides
helpful advice on how to promote this, including choosing
suitable books at the right level that are short, appealing, varied
and easy; on how to organise a library (pp. 133-41); on how to
organise an extensive reading programme (pp. 141-4) by creating
interest, developing incentives to read; and devising appropriate
monitoring and assessment procedures.
Survey reviews by the Edinburgh Project on Extensive Reading
(EPER) staff in ELT Journal (Hill, 1992; Hill and Reid-Thomas,
1988a, 1988b, 1989, 1993) give advice on graded readers in terms
of levels, readability level, appearance, text subject matter, aids to
reading, recommended reader age, and a quality rating on a scale
of 1 to 5 (an interest rating in the 1988 reviews). Hill and
Reid-Thomas (1989) note a trend to shorter books and advise publishers
that these books may be too short and possess insufficient meat
to be used as class readers. They make a plea for some longer
readers (more than 72 pages at the upper level and more than 40
at the lower).
Davis (1995: 331-5) provides some useful advice for the
development of extensive reading programmes:
■the watchwords are quantity and variety, rather than quality
■books should be attractive and relevant to students’ lives
■books should be more than sufficient in number
■books should be graded and colour coded by reading level but
students should be encouraged to move between levels
■try to make ERP school policy
■try to get it built into the timetable
■try to get financial support at least for a book box
■integrate with library studies where appropriate, for example,
through parallel grading of fiction books in the library
■develop a simple system for using the books
■develop a quick and painless system for monitoring the use of
the books.
Very similar advice can be found in Bright and McGregor (1970).
Peer interaction
Research supports the view that working together co-operatively
benefits students at all levels in mixed-ability groups or pairs. It
may be used as a follow-up to teacher-directed activities (Pearson
and Fielding, 1991: 839). They comment:
In general, successful groups work towards group goals while
monitoring the success of each individual’s learning as a criterion of
group success; also associated with positive growth are peer
interactions that emphasise offering explanations rather than right answers.
It might, however, be the novelty of working with each other,
rather than the activities engendered, that has produced the results.
Nuttall (1996: 161-6) offers advice on how students might be
guided during the reading process itself, and while emphasising
the value of individuals reading in silence on their own (this is
what reading actually is), perhaps in a self-access system, she points
to some advantages of working in a group: motivation, and more
active participation by individuals, partly because it is less
threatening than whole-class activity and partly because of
reciprocity, the recognition that everyone in the group should
contribute. It also makes students aware of how others read and
promotes thoughtful discussion on reading strategies and skills.
What is clear is that motivation has a very strong influence on
strategy use. Williams and Burden (1997: 154) argue that ‘increased
motivation and self-esteem lead to more effective use of
appropriate strategies and vice versa’. Motivation appears to be enhanced
by co-operative learning experiences (Roehler and Duffy, 1991:
867).
Jigsaw reading is a good example of the more innovative
developments that have taken place in the EFL classroom (see Grellet,
1981; Geddes and Sturtridge, 1982; and Nuttall, 1996: 209 and
257). The class is split into groups and given only partial
information on a topic, situation or story. The groups are then
reorganised so each member has a different piece of information from
which the new group has to reassemble the whole. Unless you
have information from all of the texts you cannot understand
something important in the story or situation or perform a key
task. Williams (1984: 115-11) provides some good examples of
this technique and also of what he terms enquiry strategy, where the
groups decide what information they would like to find out as a
pre-reading task.
Will it work?
Where teachers’ messages convey how to construct meaning in
reading, a positive effect has been found. Pearson and Fielding
(1991: 848) detail how this might involve focus on text structure;
encouraging students to connect background knowledge to text
ideas to make inferences, predictions, and elaborations; or prompting
students to ask their own questions about the text. In their wide
review of the research, Pearson and Fielding (1991) found such
interventions to be at least moderately successful, although
transferability was not tested for. Hoffman (1991: 942-3) also provides
data which show a positive result on achievement measures for
explicit teacher explanation.
The effective teacher will help students develop an awareness
of reading strategies necessary for successful encounters with text.
Explicit instruction in the strategies and skills discussed in Sections
4.2 and 4.3, and in the section on testing, was found by Pearson
and Fielding (1991: 849) to be helpful, particularly where the
focus is on ensuring that students understand when and why they
might be employed (see also Roehler and Duffy, 1991: 867 et seq.
and Paris et al., 1991). Cohen (1998: 19) reports:
The findings of the study would suggest that explicitly describing,
discussing and reinforcing strategies in the classroom - and thus
raising them to the level of conscious awareness - can have a direct
payoff in student outcomes.
Following instruction by questions that help mediate and build
up student understanding was also found to be of value by Roehler
and Duffy (1991: 872) in their review of the L1 research
literature. Nuttall (1996: 181-91) also examines the value of
questioning in the classroom and argues that it provides a window into the
students’ mental processes; especially where answers are wrong,
opportunities arise for learning through ‘thoughtful searching’.
Questions which make the reader work are advocated as these
focus attention on difficult elements of the text, especially where
follow-up questions get students to reveal how they arrived at their
answers. The use of MCQ format may have a place here, especially
if the distracters perform this function and useful discussion may
result. Any questions which promote discussion have a valuable
role to play as they get learners thinking about reading and
developing interpretative strategies (see Paran, 1993, for examples of
these in teaching materials).
Roehler and Duffy (1991: 864-6) outline the importance of
planning by teachers to identify critical features for their students,
to simplify the tasks and create effective examples. The selection
of tasks students are asked to do will constrain the operations
students acquire and how they interpret learning experiences. To
motivate students, tasks should encourage students to engage
in cognitive activity appropriate for the intended outcome and
students should be aware of the purposes behind these activity
structures. Teachers should specify clearly how the learning
experience is intended to be useful, and the expectancies they have of
their students. Hoffman (1991: 923) provides some empirical
evidence for the positive effect of the latter on student performance.
Remediation
What happens when they do not learn? Hoffman (1991: 915)
refers to a frequently cited frustration for teachers as dealing with
and meeting the needs of students experiencing difficulty in the
reading classroom. He presents data to suggest that the slow
pacing in low ability groups ‘does not appear to hold any promise or
pay off in terms of successful reading development’ (p. 936).
The high incidence of teacher correction often at the point of
error in reading aloud in low-ability groups is also seen as
debilitating and helps create an even wider gap between high- and
low-ability groups (pp. 937-8).
Johnston and Allington (1991: 985-6) feel that the very use of
the term ‘remediation’, with its connotation of sickness of the
child, creates an unfortunate role structure for the children tagged
in this way. They suggest that we would look at the situation
differently if we used the terms ‘children with different schedules for
reading acquisition’ or ‘children we have failed to teach’. They
question taking students out of mainstream programmes and
show how those in many remedial programmes often receive less
reading instruction than those in the classes they have been taken
from; read less text and spend less time reading any text. In such
programmes teachers’ expectations of students are lowered with
consequent effects on the way teachers interact with the students
and the results obtained. Those who get off schedule in
remediation hardly ever get back on (p. 998) and it may be the nature
of the instruction they receive, e.g. a focus on decoding rather
than meaning, which keeps them that way (p. 999).
Johnston and Allington (1991) argue that the only way to deal
with it is to eliminate the need for it in the first place and most
effectively by early intervention before problems are compounded.
Class sizes might be reduced but the solution is likely to lie in
higher quality instruction and the creation of non-competitive
tasks involving concepts to increase involvement and co-operation
- in fact, through many of the co-operative procedures discussed
above and below. In this way remediation might become
intercession or friendly intervention by consent or invitation (p. 1005).
Nuttall (1996: 144-5) offers some sound advice for those who
simply read too slowly in the EFL classroom. She also argues that
special attention may be needed and the provision of lots of easy
readers is insufficient on its own.
Effective schools
Successful reading instruction is more than just an interaction
between teacher, students and materials, however. The context in
which learning takes place can have an important effect and it is
necessary to consider this wider environment for learning as well,
even if the teacher can do little about it outside the walls of his or
her own classroom.
Hoffman (1991) surveyed research into effective schools, as
defined by performance on standardised reading test scores, and
identified a number of features that are potential contributors,
though reservations are expressed about methodology,
generalisability and validity. They included:
■clear school mission
■strong curriculum leadership usually from a head
■instructional efficiency: ‘utilisation of resources to achieve
maximal student outcomes’
■high expectations for students
■good atmosphere; safe, orderly and positive
■commitment to improvement
■individualisation
■careful evaluation of student progress
■reading identified as an important instructional goal
■breadth of material available
■attention to basic skills
■communication of ideas across teachers.
Future research
INTRODUCTION
Throughout this book so far, we have tried to indicate our
preference for claims and conclusions based on empirical data, rather
than rhetoric and good-hearted sentiment. We finish, therefore,
with a chapter devoted to considerations of some future research
directions for teaching and testing reading in an L2.
The relationship between research and teaching is complex,
and worthy of a study by itself. A single publication by Goodman
(1967) had what some consider to be a disproportionate amount
of influence on the teaching of reading both in the L1 and L2. It
remains to be seen whether the work of Stanovich and others has
anything like the same effect. The huge change in L2 from a
focus on linguistic structures to emphasis on communicative use
was not motivated by empirical data as is normally understood by
the term; it seems to have occurred at least in part by an upswelling
of dissatisfaction among the teaching community with the
previous paradigm.
The characteristics of the participants involved in teaching and
testing on the one hand, and research on the other, are again of
interest. Both sides have their virtues and vices. Teachers and
testers are usually in contact with real learners, and often form
hunches regarding these learners based on extended experience,
which may well be valid, even if difficult to substantiate. They
often lack, however, an explicit theoretical framework against which
to relate these hunches, and thus evaluate them and extend them.
Researchers, on the other hand, usually operate within a
theoretical framework. However, they may not bring experience to bear
on this framework. Moreover, they sometimes rely uncritically on
experimental results that are not well substantiated or have not been
repeated. Here we make an attempt to bring the two worlds together, to
evaluate teaching and testing methods in the light of theories, and
vice versa.
It is sometimes fashionable in teaching circles to sneer at ‘theory’
(which sometimes seems to extend to everything except
methodology). To the extent that some teacher trainers, having escaped
from the classroom themselves, have appeared to exclude
‘practical’ considerations from what they tell students, this reaction is
understandable. We, however, do not go very far in our sympathy.
Teachers and testers need both a sound grasp of practical
matters, and an educated framework on which to base and to evaluate
their methods. We firmly believe that without such a framework,
the teacher or tester is trapped in a particular set of practices,
with no motivated criteria for making alterations. Often they are
reduced to evaluating teaching practices in terms of whether the
students enjoyed them or not.
The monitor
The notion of cognitive monitoring has been part of the reading
construct for more than a decade. Baker and Brown (1984: 22)
define it as:
The ability to use self-regulatory mechanisms to ensure the
successful completion of the task, such as checking the outcome of any
attempt to solve the problem, planning one’s next move, evaluating
the effectiveness of any attempted action, testing, and revising one’s
strategies for learning, and remediating any difficulties
encountered by using compensatory strategies.
They add that:
Since most of the cognitive activities involved in reading have as
their goal successful comprehension, a large part of cognitive
monitoring in reading is actually comprehension monitoring.
In our model, the monitor is closely connected to the goalsetter.
Its principal task is to keep a running check on whether the goal(s)
are being achieved. Thus, it is important to realise that the
monitor varies in what it considers successful activity, in terms of the
task set by the goalsetter. To drop for a moment into a
fashionable computer analogy, the monitor may be ‘set’ at different
values.4 Thus the monitor should demand different standards of
behaviour when the reader is, for example, scanning, as
compared to when the reader is engaged in careful global reading. If
readers do not vary the ‘setting’ of their monitor, then in many
cases their goal will not be achieved.
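The idea of a monitor that is 'set' at different values can be made concrete with a small sketch. The Python fragment below is an illustration only: the two reading styles come from the discussion above, but the function names, the success criteria and the 0.8 threshold are assumptions introduced for the example and are not part of the model.

```python
# Illustrative sketch: the criteria and the 0.8 threshold are assumptions,
# chosen only to show one monitor applying different standards.

def scanning_goal_met(found: int, sought: int) -> bool:
    """Scanning: a simple check that every item sought has been located."""
    return found >= sought


def careful_global_goal_met(recovered: int, total: int,
                            threshold: float = 0.8) -> bool:
    """Careful global reading: most of the main ideas must be recovered."""
    return total > 0 and recovered / total >= threshold


# The goalsetter chooses the style; the monitor applies the matching standard.
MONITOR_SETTINGS = {
    "scanning": scanning_goal_met,
    "careful global": careful_global_goal_met,
}


def monitor(style: str, achieved: int, required: int) -> bool:
    """Report whether the goal for the given reading style has been met."""
    return MONITOR_SETTINGS[style](achieved, required)
```

On this picture, a reader whose monitor stays at a single setting would apply the same standard to every task, which is one way of reading the claim that the goal will then often not be achieved.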
Modes of measuring
In theory it should be possible to investigate the reading process
at all the relevant points along the line. We could, for example,
investigate the activity of the goalsetter, of the monitor, the
success or failure of the sentence by sentence processing activity, as
set out in our model, or the product. In practice, it is likely to be
difficult to distinguish the contribution of the different
components: the product, for example, may be deficient either because
the initial formulation of the task was defective, or because the
monitor did not operate efficiently while reading was going on.
However, we can suggest general areas of investigation, and be
more specific when we discuss particular reading styles.
The goalsetter may be investigated by interviewing readers as
to how they interpreted the task. Alternatively, the relationship
between goalsetter and product may be examined. There is ample
evidence in the literature of different tasks resulting in different
products (Thomas and Augstein, 1972; Rayner and Pollatsek, 1989:
452-3).
Baker and Brown (1984: 23ff) suggest various ways of
investigating the monitor, including:
■observation of readers
■analysis of oral reading errors
■assessing certainty concerning responses
■cloze procedure
■text disruption.
We are happier with some of these than with others: oral reading
has, in our opinion, to be handled very carefully, if at all, in the
L2 classroom. Nevertheless, it is useful to know that some
progress has been made towards an established range of investigatory
methods.
Since we are distinguishing between expeditious and careful
reading, an obvious area of observation is speed of reading. In the
absence of any more sophisticated method of assessing reading
speed, there are two general ways of going about this in the
classroom: either the students read a text and are timed doing it, then
reading speed is measured in terms of words read per minute (cf.
Fry, 1963), or the students are given a reading task and a fixed
time in which to accomplish it. On the whole, in the investigations
suggested below, we favour the first method. It brings with it,
however, the problem of whether to include some measure of
success on a reading task, as well as simply speed of reading. Fry,
for example, uses multiple-choice comprehension questions to
be answered after reading. Since no reference back to the text is
allowed, this introduces a memory factor. It may be best simply
to instruct readers to read a text at their ‘normal’ speed, and
time their performance. However it is done, when we go on to
compare, say, scanning speed with a more careful reading speed,
we need some estimate of ‘normal’ reading speed to use as a
base line.
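The first method, timing a reader on a text and expressing the result as words read per minute, amounts to a simple calculation. The sketch below shows one way of doing it in Python; the function name is hypothetical, and counting words by splitting on whitespace is an assumption, since the text does not say how words are to be counted.

```python
def words_per_minute(text: str, seconds: float) -> float:
    """Reading speed for one timed reading of a text.

    Words are counted by splitting on whitespace (an assumption; other
    counting conventions would give slightly different figures).
    """
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    word_count = len(text.split())
    return word_count / (seconds / 60)


# Example: a six-word text read in 30 seconds.
rate = words_per_minute("one two three four five six", 30)
```

Note that this measures speed only; it says nothing about success on a reading task, which is the complication raised above.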
We now proceed to look at the different reading styles, with
a view to making suggestions as to how their existence or
non-existence may be investigated among L2 readers. We begin with
scanning, which might be considered an extreme case.
Scanning
We have already defined scanning as ‘reading selectively, to achieve
very specific goals’, e.g. finding the number in a directory, finding
the capital of Bavaria (in a geography or history book). Nuttall
(1982) defines it as:
glancing rapidly through a text either to search for a specific piece
of information (e.g. a name, a date) or to get an initial impression of
whether the text is suitable for a given purpose (e.g. whether a book
on gardening deals with the cultivation of a particular vegetable).
Consideration of these definitions suggests that scanning is not
quite the simple concept that we originally thought. The definition
we had originally in mind covered instances like finding a solitary
numerically expressed date, or finding the word ‘Munich’. Neither
of these activities, however, seems very natural. A more natural
event, such as being asked ‘Find the date of the battle of
Waterloo’, might involve a search for the collocation of Waterloo and a
date, but once the reader had found the general area, it would also
involve a scan towards the beginning, where such introductory
information would be likely to occur. A similar situation would
arise if one were given a book on the geography of Germany and
asked to find the capital of Bavaria.
It looks as if scanning merges with what we have called ‘search
reading’. In what follows, however, we will stick to examples of
what one might call ‘extreme’ scanning, i.e. activities similar to
those carried out by the computer on the instruction ‘Find’. We
might add that such activities are very different from ‘normal
reading’. Rayner and Pollatsek, discussing ‘proof reading’, query
whether the activity has implications for normal reading. We think
that scanning, as defined above, may have such implications in an
L2 context - a point we discuss below.
When we consult our model, it might seem that all that is
involved in scanning, as defined above, is word recognition; there
is no need in the cases above for processing the syntax or
semantics of the sentence containing the search item, and no need,
apparently, for the reader to bring background knowledge into
play. In fact, one might conceivably argue that the reader does
not even need to access the lexicon, since it would presumably be
possible to ‘scan’ a text for a nonsense word. Certainly, numbers
in telephone directories do not require access to the lexicon. We
referred above to the facility most popular word processor programs
contain for ‘finding’ particular words. The computer accomplishes
this by a process of running through the text, matching each
word it comes across with the search item. There is no need for
meaning. We don’t know of any research involving readers
scanning in a language unknown to them; it seems likely that it would
be an exhausting experience.
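The word-by-word matching attributed to the computer here can be sketched in a few lines of Python. This is a simplified illustration rather than a description of how any particular word processor works: the function name is hypothetical, and treating a 'word' as a run of letters or digits is an assumption.

```python
import re


def find_word(text: str, target: str) -> list:
    """Return the positions (0-based word indices) where target occurs.

    Each word is compared with the search item by form alone; nothing
    about its meaning is consulted.
    """
    words = re.findall(r"\w+", text)
    return [i for i, word in enumerate(words) if word == target]


passage = "The capital of Bavaria is Munich. Munich lies on the Isar."
hits = find_word(passage, "Munich")

# A nonsense word is found in exactly the same way as a real one.
nonsense_hits = find_word("a blorf b", "blorf")
```

Because the comparison is purely one of form, the procedure works equally well for a nonsense string, which is the point made above about scanning without access to the lexicon.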
Our principal research aim in this case is to determine whether
our L2 readers can scan as well as read ‘normally’, in other words
whether they have access to more than one strategy. A secondary
aim, given that the answer to the first question is ‘Yes’, is to
investigate how they are doing it.
The obvious way to investigate the first question would seem to
involve comparison between scanning and, say, normal reading. If
one just gave the students a text, and told them ‘Find the
following words in the text’, there is nothing to prevent them from
plodding through at normal speed, identifying each target word
in the course of normal reading. In fact, if they have only one
strategy, then this is what they will do. According to our classification
of reading behaviours, scanning belongs to the ‘expeditious’
group, i.e. it should be carried out at a faster speed than normal
reading. Hence one way of establishing whether students can scan
is first to find out their normal reading speed (which is best done
over a number of trials), then find out whether their scanning
speed is faster. There are several elementary precautions to take.
If we already have a measurement of each student’s normal reading
speed, then we do not have to worry too much about the
length of the text(s) used for scanning. Scanning may seem to
require texts of substantial length, which beginning students might
find exhausting to read carefully. However, texts used for scanning
and careful reading should probably be similar in terms of
familiarity and difficulty.
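The comparison described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the words-per-minute measure and the trial figures are hypothetical and invented for illustration, not data from any study.

```python
from statistics import mean


def words_per_minute(word_count, seconds):
    """Reading speed in words per minute for one timed trial."""
    return word_count / (seconds / 60.0)


def compare_speeds(normal_trials, scanning_trial):
    """Compare one scanning trial with the mean of several normal-reading trials.

    Each trial is a (word_count, seconds) pair. Returns a tuple:
    (scanning was faster?, mean normal wpm, scanning wpm).
    """
    normal_wpm = mean(words_per_minute(w, s) for w, s in normal_trials)
    scan_wpm = words_per_minute(*scanning_trial)
    return scan_wpm > normal_wpm, normal_wpm, scan_wpm


# Hypothetical student: three normal-reading trials, then one scanning task.
normal = [(300, 120), (450, 180), (600, 240)]
scanning = (1200, 180)
print(compare_speeds(normal, scanning))  # prints (True, 150.0, 400.0)
```

Averaging over several normal-reading trials follows the suggestion that normal speed is best established over a number of trials; how much faster scanning must be to count as a different strategy is left open here.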
The number of items to be looked for in such a test is presumably
a matter of experience. If dates are used (they might seem
suitable as a practice to familiarise students with the activity), then
only a small number, i.e. 1 to 3, would seem enough. If the instructions
are ‘Find the following words or phrases’, then 1 or 2
would probably be too few, and easily missed. Ten would seem a
possible compromise.
We are not restricted to comparative reading speeds as a means
for assessing whether scanning or some other reading behaviour
has taken place. In a scanning operation, we see the monitor as
set at a simple Yes/No standard; i.e. is x the word the reader is
looking for or not? This is the case in a computer Find operation.
If the answer is ‘No’, then the search moves on and the last item
examined is dismissed from attention. This means that if a scanning
operation has been carried out successfully, not only will all
the items requested have been identified, but, in theory at least,
nothing else from the text has been recovered, i.e. committed to
memory. Gibson and Levin (1974: 539) remark that: ‘One can
scan for a graphic symbol or a word target very fast, but the scanner
remembers almost nothing of what he saw except the target.’
An examination of the reader’s knowledge of the contents of the
text just scanned, carried out, perhaps, a few minutes after the
scanning period is complete, should reveal that the reader retains
little or nothing of the text. Either an interview or a test can be
used for this. We have here the odd case when the lower the
reader scores on this test, the better.
Pedagogical implications
We have already indicated some doubts as to whether scanning as
we have defined it is reading at all. As stated above, Rayner and
Pollatsek had doubts as to whether editing tasks had ‘implications
for normal reading’, i.e. whether conclusions based on behaviour
on such tasks had any relevance to discussions of normal reading.
We think that scanning is relevant to L2 reading in two ways.
The first way is partly methodological. In our experience, some
L2 readers insist on one style of reading - a relentless, slow plod
through the text, beginning at the top left-hand corner, and continuing
to the end, the process only broken up in some cases by
frequent recourse to a dictionary or to the teacher as a dictionary
equivalent. This form of reading behaviour, which can be quite
hard to break, may be an epiphenomenon, i.e. a product of previous
experience in the classroom. A ruthless insistence on scanning
on the part of the teacher may help break this pattern.
The second way is more basic. We have already seen in Section
2.2 the importance that is now attached to automatic word recognition
in reading. It is arguable that what makes scanning easy for
a good L1 reader is just such recognition; the monitor is able, very
quickly, to provide its ‘Yes/No’ answer. The L2 reader, on the
other hand, being less able to distinguish between, say, blip, flip,
bleet, fleet, etc., is likely to find scanning much more difficult. In
fact, it is again arguable that such readers will have to consult the
context of a target word in order to identify it. Thus scanning can,
at the very least, serve as a useful test of word recognition, not
only of the target items but also of the surrounding items in the text.
Skimming
The reader is referred back to the relevant text of Sections 2.4, 3.2
and 4.3 where we have discussed these strategies in some detail. In
Section 2.4 we provided initial working definitions which were
then developed in Section 3.2 on testing. In Section 4.3 we tried
to break each down further into its constituent enabling operations
for use in classroom learning tasks. We define skimming as
‘expeditious reading carried out for the purpose of extracting
gist’. It thus contrasts strongly with our description of scanning
(see also Appendix 1 for working definitions used in the Chinese
Advanced English Reading Test).
Authors are in agreement as to the value of skimming. Nuttall
considers that skimming (together with scanning) enables the
reader ‘to select the texts, or the portions of a text, that are worth
spending time on’, thus suggesting that skimming is a preliminary
to careful reading. Rayner and Pollatsek (1989: 447) remark that:
Skimming ... is a very important skill in our society. In careers that
depend on the written word, there is simply too much information
to be assimilated thoroughly, and we are constantly forced to select
what we look at. Those unable to skim material would find they
spend their entire day reading.
Rayner and Pollatsek equate ‘speed reading’ with skimming, and
say that:
speed readers appear to be intelligent individuals who already know
a great deal about the topic they are reading and are able to successfully
skim the material at rapid rates and accept the lowered
comprehension that accompanies skimming. (p. 448)
The gist which we mentioned above should be something like a
reduced form of Kintsch and van Dijk’s macrostructure. A look at
the model will suggest that, in the process of skimming, even if an
entire sentence is processed, the reader will not necessarily proceed
to the next text sentence. The process is therefore selective, some
parts of the text being omitted. In turn, and keeping in mind
Kintsch and van Dijk’s model, this means that coherence relations
between different propositions in the text are likely to be sacrificed
in the act of skimming. It follows from this that, if the product
of the skimming is to be coherent, then background knowledge is
going to have to play an increased role in the build up of the
macrostructure. Rayner and Pollatsek consider that the successful
speed reader (i.e. skimmer) already knows a lot about the topic.
It seems to us, then, that the efficiency with which L2 readers
skim a text is likely to depend crucially on their knowledge, either
of the topic of the text being skimmed, or the structure of the
text, or both, and that this is likely to be even more the case than
with careful reading. This familiarity may come either from previous
reading in the L2, or from their previously acquired literacy
in the L1.
Given that tasks aiming to induce skimming can be framed
with sufficient clarity, and that the readers are familiar with what
is expected of them, more than one hypothesis can be derived
from the above discussion.
■ On the same text, or on texts of equivalent length and difficulty,
reading speed when in skimming mode will be significantly faster
than when in careful reading mode.
■ Macrostructure built up on the basis of skimming should be
significantly less detailed than that acquired through careful
reading. If the skimming is efficient, however, the propositions
omitted from the macrostructure should be the lower level, more
detailed ones.
■ Skimming performance should be significantly reduced if the
readers are exposed to texts of unfamiliar structure.
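The first hypothesis above compares the same readers in two modes, so one possible analysis is a paired t-test on reading speeds. The sketch below is only one way of approaching this, and it rests on assumptions of our own: the function name `paired_t` and the speed figures are hypothetical. It computes the t statistic and degrees of freedom only; the statistic would still have to be compared with a critical value from a t table to judge significance.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic for two sets of scores from the same readers.

    Returns (t, degrees_of_freedom). A positive t means the first
    set of scores is higher on average.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical words-per-minute figures for five readers.
skimming_wpm = [320, 290, 410, 350, 300]
careful_wpm = [150, 160, 180, 170, 140]
t, df = paired_t(skimming_wpm, careful_wpm)
print(round(t, 2), df)
```

The same function could be applied to counts of recalled propositions (second hypothesis), or to skimming scores on familiar and unfamiliar text structures (third hypothesis), provided each reader contributes a score under both conditions.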
Obviously experimental work designed to test these hypotheses
would incorporate timed reading, as well as recall protocols elicited
after the reading was complete. A useful controlled exercise
in teaching skimming consists of students being given very general,
high-level questions before they read, for example ‘give a
title for the passage’. This method could be used together with
while-reading observation of students to find out how they proceeded
to answer the questions. Signs of skimming might be rapid
movement between pages, possible regression back across pages,
as well as rapid completion of the exercise. Self-report protocols
might also provide insightful data (see Cohen, 1998).
Careful reading
With careful reading, we are in a somewhat different position
concerning speculation as to what strategies are available to students.
It would be the view of many teachers of reading that students
are definitely able to read carefully both locally and globally,
whereas this might not be the case for our other posited types of
reading. This may or may not be a valid assumption. The hypothesis
would be that students would perform significantly better on
questions aimed at careful reading than on other types of reading
(see Section 5.3 below). Also, their speed of reading in the former
would be similar to their speed when timed during ‘normal reading’
sessions.
Kintsch and van Dijk (1978: 371) speculate that:
If a long text is read with attention focused mainly on gist comprehension,
the probability of storing individual propositions of the
text base should be considerably lower than when a short paragraph
is read with immediate recall instructions.
Their first case concerns skimming, in our terms; the second is an
instance of what we would refer to as careful local reading.
When comparing our two types of careful reading (global and
local), it is obvious that speed is not going to be an issue. We can
concentrate on other aspects of the reading process, an important
one being performance on tasks (see Section 5.3 below). In Section
3.2 we reported evidence of L2 students performing better
on global as opposed to local questions. It looks as if some L2
readers, at least, are using background knowledge to compensate
for linguistic deficiencies when reading for global meaning. An
attractive area of research here would be to locate such students,
then investigate in detail their reading behaviour at low levels as
described in the model, i.e. lexical access, syntactic and semantic
processing, and establishment of cohesive links.
Conclusion
The pedagogic literature frequently refers to careful reading, skimming,
scanning, as well as ‘intensive’ and ‘extensive’ reading. We
have indicated that we do not think these types have been defined
with sufficient clarity in the literature. We have attempted in Chapters
2, 3 and 4 to clarify the differences and establish operational
definitions for the five ‘main’ skill and strategy groupings. It is our
hope that if a number of researchers can be persuaded to work in
the areas discussed above, and if they publish their results with
due attention paid to texts and methods used, together with full
description of the students involved, we may acquire a solid set of
empirical data to support and help to develop pedagogical practice.
We now turn from general considerations of theory and how this
might be developed to a particular area in the pedagogy of reading
that we feel is in urgent need of investigation by teachers,
testers and researchers. In our discussion of how to investigate the
relationship between the teaching of grammar and its impact on
reading ability, close attention is paid to the definition of terms as
the authors feel this to be the sine qua non of acceptable research.
5.2 READING AND THE TEACHING OF GRAMMAR
Introduction
As the research topic related to the teaching of reading, we are
suggesting an investigation of the relationship between the teaching
of reading in the classroom and the teaching of grammar. We
have chosen this topic for a number of reasons. First, we do not
believe it has ever been systematically investigated. We share
Bernhardt’s surprise that there has been so little work in this area
(Bernhardt, 1991a). Secondly, it would probably be generally
agreed that the processing of syntax, as part of the wider processing
of written information, is a ‘low level’ skill, and, as Eskey
(1988) has argued, such skills have tended to be neglected in the
teaching of reading in a foreign language. Thirdly, it seems to us
that many FL teachers have an interest in and knowledge of grammar,
so that the topic should be accessible to them.
A pedagogical approach
We should make clear that, in proposing this topic, we are not
querying the view that knowledge of grammar plays a part in the
reading process. All the models of reading reviewed in Section 2.1
either assert or assume that syntactic knowledge is a component,
and we see no reason to question this. What we are proposing is
an investigation into whether the teaching of grammar in the L2
classroom has a discernible effect on the students’ ability to read
the L2. The research is thus pedagogically focused.
Having chosen this research topic, the most obvious focus of
interest concerns whether a conscious, taught knowledge of grammar
has an effect, hopefully beneficial, on students’ reading performance.
There are, however, other areas of interest involving
grammar which it would be worth while to examine, e.g. whether
there are particular areas of grammatical knowledge which seem
to correlate positively with reading performance.
What follows is not intended to be an outline of a specific
research programme, rather an indication of how we think research
might be carried out in a particular area. Because we have
criticised other writers for lack of clarity in the use of terms, we
shall spend rather longer than perhaps expected in discussing
terms, particularly the term ‘grammar’. We are strongly of the
opinion that, unless the researcher has a clear idea of what is
meant by this term, any results obtained will be immediately open
to criticism.
Defining syntax
Such divergences can also be found in pedagogical texts. Some
pedagogical grammars, e.g. Allen (1974), Berman (1979), Shepherd
et al. (1984) include ‘Reported Speech’ among obviously syntactic
structures such as the Passive, Defining/Non-defining clauses
(Berman), Infinitive and Gerund (Allen). By contrast, in Morgan
and Batchelor (1959) ‘Direct and Indirect Speech’ is a separate component
from ‘Grammatical Study’, being placed beside ‘Punctuation’
and ‘Style and Vocabulary’. Thus in the early textbook, Reported
Speech is clearly (and correctly) not being viewed as a strictly syntactic
phenomenon. More generally, the grammar by Downing
and Locke (1992) contains far more ‘communicative’ information
than does that of Quirk and Greenbaum (1973) which, while
eclectic, is more determinedly structural.
Definitions
Sampson (1975) defines syntax as ‘. . . how words are put together
to form sentences’. According to Horrocks (1987: 24):
Syntax is concerned with the principles according to which words
can be combined to form larger meaningful units, and by which
larger units can be combined to form sentences.
For Crystal (1997) syntax is ‘the way in which words are arranged
to show relationships of meaning within (and sometimes between)
sentences’. As aspects of sentence syntax, he mentions hierarchy,
grammatical function, concord, and transformations (pp. 94-7).
Even in these brief definitions there appear to be some significant
divergences. For example, Horrocks, to some extent, and Crystal,
in particular, seem to attach much more importance to meaning
than does Sampson. Crystal’s inclusion of relationships between
sentences would be rejected by many. Hence it may be useful
to examine, rather than definitions, writers’ accounts of general
aspects of syntax.
General principles
Bolinger (1975) considers that ‘the first rule of syntax. . . is that
things belonging together will be together’. As well as this principle
(which is certainly true for English, but seems less true for,
say, classical Latin), he cites, as coming inside the scope of syntax,
operators, both grammatical morphemes and function words, the
structure of phrasal constituents, word classes (even though he
claims the classes are ‘basically semantic’), grammatical, logical
and psychological functions (e.g. ‘subjects’, ‘objects’, etc.), the
grammatical functions of sentences (e.g. declarative, imperative,
transformations), and ‘higher sentences’ (e.g. performatives).
Horrocks and Crystal both refer explicitly to meaningful syntactic
relationships. Just how much meaning is involved in syntax is an
important question for the type of research we are proposing,
and will be reviewed later. In Halliday
and Hasan (1976) there is a suggestion that the difference between
grammatical and lexical meaning is a question of generality, grammatical
items being more general than lexical ones, e.g.
The general words (e.g. ‘thing’, ‘person’, ‘idea’) . . . are on the borderline
between lexical items and substitutes.
(p. 280; our parentheses)
Bolinger returns on more than one occasion to the notion that
grammar (syntax and morphology) is primarily concerned with
intra-language relationships (an example is the role of the complementiser
‘that’ which functions to show that a following clause
is embedded), whereas semantics is concerned with ‘real-world’
relationships. A similar notion can be found in Halliday and Hasan,
where reference relations are ‘semantico-pragmatic’ while substitution
acts at the ‘lexico-grammatical’ level (pp. 88ff). Bolinger, however,
points out that the distinction is difficult to maintain. In the phrase,
‘Jane’s house’, the possessive morpheme refers to a relationship
in the real world: Jane either owns or occupies the house. In the
phrase ‘Jane’s cooking’, on the other hand, the same morpheme
serves to show that ‘Jane’ is the grammatical subject of the verb
‘cooking’. Hence the distinction, while undeniably there, is of
little use in determining what is syntactic and what is semantic.
General principles, while interesting, appear to be insufficient
to decide with some degree of precision what does or does not
constitute ‘grammar’ or ‘syntax’. Faced with this problem, we have
two choices: either we can define, say, ‘syntax’ operationally, saying,
in effect, ‘this is what we consider to be syntax for the purposes of
this research’, or we can adopt a particular model, saying, in effect,
‘this is a model of syntax; anything described by the model therefore
comes within the bounds of syntax’.
A formal model
At this point we ought to come clean and state that what we are
really looking for is a ‘formal’, ‘structuralist’ model, with as little
recourse to ‘meaning’ or ‘communicative value’ as possible. Given
the general popularity in recent language teaching of approaches
stressing meaning or communication, this would seem a rather
strange, even perverse, approach to take. And in taking it, we are
not, of course, decrying the importance of meaning or of communication.
We have, in fact, two reasons for our preference. The
first is related to what we discussed earlier with reference to the
Heaton example: unless we are very clear about what we mean
by grammar, our research will be open to criticism that it
incorporated a lot more than just grammar. We believe that adopting
a formal approach minimises this danger.
Our second reason will become clearer as we progress, but
concerns the fact that we see one aspect of the research as correlating
performance on grammar/syntax tests with performance
on reading tests. It seems clear that the more text-focused or
‘communicative’ our grammar model is, the closer our grammar
tests based on this model will be to tests of low-level reading skills.
There is, however, little point in correlating such tests with reading
tests: little can be learned by correlating A with A. Therefore,
we would argue, we should begin, at least, with seeing whether we
can find a correlation between linguistic competence, measured
by a test of formal syntax, and linguistic and communicative performance,
measured by a reading test.
Choosing a model
We are looking, then, for a formal model of grammar, one which
sets out to describe the permissible (grammatical) sequences of
words or formatives in the sentences of whichever language we
are concerned with. However, we quickly run into a major problem:
while there is no shortage of theoretical formal grammars,
particularly in the Chomskyan tradition, they have evolved to a point
where they have major disadvantages for the classroom experimenter,
being either very difficult and abstract, and hence inaccessible,
or insufficiently developed at a descriptive level, or both.
Pedagogical grammars, on the other hand, are predictably less
abstract, and are far more accessible to teachers and researchers.
However, for understandable pedagogical reasons, they do not
limit their description to syntactic structures. For example, in A
University Grammar of English, Quirk and Greenbaum (1973) provide
a quite detailed description of formal aspects of the English
verb phrase, but accompany this description with information about
the use of verb forms, e.g.:
In indicating that the action is viewed as in process and of limited
duration, the progressive can express incompleteness even with a
verb like stop whose action cannot in reality have duration; thus the
bus is stopping means that it is slowing down but has not yet stopped.
The progressive (usually with an adverb of high frequency) can
also be used of habitual action, conveying an emotional colouring
such as irritation. (p. 41)
Similarly, the Cobuild English Grammar (Sinclair, 1990), and related
volumes, combines very formal lists of verb patterns with distinctly
non-formal information such as:
You can also use ‘be possible to’ with ‘it’ as the subject to say that
something is possible. You usually use this expression to say that
something is possible for people in general, rather than for an
individual person. (p. 239)
Thus theoretical grammars are formal, but inaccessible, while pedagogical
grammars are accessible but contain much non-structural
information.
As a compromise, we propose that researchers allow a theoretical
grammar to set the limits of formal syntax, while filling in
the descriptive details from pedagogical grammars. If we choose
Government/Binding Theory, for example (see Horrocks, 1987),
the Base Component, which includes the Lexicon, will define as
syntax the following:
(i) The structure of lexical phrases, and of the sentence, including
embedded sentences.
(ii) Transformations relating, say, Interrogative and other Wh-
constructions to declarative structures.
(iii) A huge variety of syntactic restrictions included in the Lexicon.
In addition, the morphological component will include
information about the parallelism of structures such as ‘Newton
developed the theory’ and ‘The development of the theory
by Newton’.
It is, of course, a major feature of G/B Theory that it contains a
number of subtheories designed to filter out deviant sentences
which the Base Component has allowed to be generated. On the
whole we do not think these subtheories are relevant to our needs,
so we do not detail them.
With the fundamentals of syntax thus outlined by the theory,
we can now go to the pedagogical grammars to flesh out the descriptive
detail. A University Grammar of English appears to be very
suitable for details of lexical phrases and embeddings, while the
Cobuild English Grammar, or the Cobuild Grammar Patterns 1: Verbs
(Francis et al., 1996) appears highly suitable for verb patterns.
This combination description can then be used, at least initially,
as the basis for grammar tests in the research outlined below.
Correlational studies
The easiest way of investigating whether a relationship exists in L2
between reading and grammar is to measure students’ performance
on tests of grammar and reading and then correlate the results.
Alderson (1993) did this with a specially constructed ‘grammar’
test and various modules of the IELTS test then being constructed.
He found, in general, high correlations between the grammar test
and the different modules. In fact, correlations were high between
virtually all the tests:
The relationship between Reading and Listening is as close as or
closer than the relationship between one reading test and other
reading tests! (p. 213)
He concludes that ‘the results, then, appear to show that a (vaguely
defined) generalized grammatical ability is an important component
in reading in a foreign language’ (p. 218). It is important to
stress that correlational studies of this sort do not point to a causal
relationship. Alderson’s results, as he himself makes clear, could
be interpreted as meaning either that grammatical ability improves
reading, or that reading ability improves performance on a grammar
test, or that the relationship is the result of a third factor,
which he terms ‘language proficiency’. Indeed, given the high
correlations between all his tests, this last might seem to be the
most likely explanation. Moreover, from the point of view of the
research being suggested here, correlational studies involve testing
but not necessarily teaching. Alderson’s tests are proficiency
tests, unrelated to any particular teaching syllabus. Nevertheless
such studies do highlight certain aspects of the problem which we
shall examine before moving on to more pedagogically oriented
studies.
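For readers unfamiliar with the calculation, the correlation between two sets of test scores can be sketched as below. This is a minimal illustration of the standard Pearson coefficient, not a reconstruction of Alderson's analysis; the function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for five students on a grammar test and a reading test.
grammar_scores = [12, 15, 9, 20, 17]
reading_scores = [30, 34, 25, 41, 36]
print(round(pearson_r(grammar_scores, reading_scores), 3))
```

A coefficient near 1 shows only that students who score well on one test tend to score well on the other. As noted above, it cannot show which ability, if either, is responsible for the other.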
First, the test of grammar used must, as far as possible, be just
that, i.e. it must relate to a clear definition of what constitutes
grammar; hence the extended discussion above. Alderson’s test
consisted of six subsections: (i) vocabulary; (ii) morphology; (iii)
prepositions, pronouns, etc., along with rather vaguely termed
‘lexical sets’; (iv) verb forms, etc.; (v) transformations; and (vi)
‘reference and cohesion’. In our definition of ‘grammar’, subsection
(i) must be eliminated; the ‘lexical sets’ of subsection (iii)
are doubtful. As far as subsection (vi) is concerned, ‘reference’ is
normally treated as a form of cohesion; the latter, if it concerns
relationships outside the sentence, will not be classed as syntactic,
while identifying pronominal links between cohesive items and
their referents, etc., has more to do with pragmatics than syntax.
Secondly, as Alderson makes clear, the grammar and reading
tests should be as separate from each other as possible. Since most
grammar tests involve the students in reading, this is not an easy
task. A grammar test can be viewed as a specialised reading test.
However, there are a number of steps we can take to reduce the
resemblance. We have already said that syntax, in most definitions,
is sentence-bound. Comparatively little written text is similarly
sentence-bound. Therefore we can achieve some measure of
difference by making our grammar test consist of decontextualised
sentences or phrases. This will tend to rule out the use of continuous
text, as in cloze procedure, or the gap-filling of continuous
text recommended by Heaton (1988) and used by Alderson.
It was largely this wish to have the widest possible distance between
grammar and reading that influenced the discussion above
in the direction of strictly formal, as opposed to more functional
or semantic, grammars. Alderson classes the test he used as ‘communicative’
on the grounds that ‘we wished to test a student’s
ability to process and produce appropriate and accurate forms in
meaningful contexts’, justifying this on the grounds that ‘the ability
to manipulate form without attention to meaning is of limited
value and probably rather rare’ (p. 218). In general we would agree,
but consider that this argument is irrelevant for our research; in
reading we would require a student to process appropriate forms
in meaningful contexts. What we are investigating is whether a
knowledge of formal syntax is of help in this activity.
Alderson refers to ‘meaningful contexts’ and the undesirability
of teaching ‘form without attention to meaning’. The question of
meaning is a particular problem in the context we are discussing.
Since reading involves the extraction of meaning, it is clear that
any grammar test involving a heavy emphasis on meaning is likely
to overlap with reading tests. When listing a number of grammar
test-types, Heaton remarks that they test ‘the ability to recognize
or produce correct forms of language rather than the ability to
use language to express meaning, attitude, emotions etc.’ (p. 34).
However, one of his types involves matching sentences like ‘Tom
ought not to have told me’ with possible paraphrases. From the
viewpoint of G/B Theory, this looks like the interface between
syntax and semantics. Significantly, Heaton (1988: 35) remarks
that ‘such an item may be included either in a test of reading
comprehension or in a test of grammar’. We would like to restrict
such items either to grammar tests or to reading tests, and would
suggest that if they are included in grammar tests, care should be
taken to exclude inferences, together with the referential and
sense meanings of lexical items.
Beyond correlations
Correlations, if they exist, between performance on syntactic and
reading tests are interesting and worth while examining. However,
the research design touched on above involves what have
traditionally been termed proficiency tests, i.e. they are not constructed
with reference to a particular syllabus. This conflicts with
our stated aim at the beginning of this section, namely to investigate
the relationship between reading and a taught syntactic component.
In addition, as pointed out above, correlations do not
establish causes, only relationships. If we want to test whether a
taught grammar course has a measurable effect on subsequent
reading performance, we need a different research design.
A pedagogically more relevant research design would involve
comparison between one group, the experimental group, and an
equivalent group, the control group. Both groups are given a reading
pre-test at the beginning of the experimental period. Ideally
there should not be a significant difference between the means
of the two groups, but this is not strictly necessary. During the
experimental period, which would probably extend over a term or
semester, the experimental group is taught grammar, according
to a grammar syllabus based on the topics outlined above. Normal
reading instruction could continue, but should be kept separate
from the grammar instruction. During this time the control group
gets the same reading instruction, but is not given the grammar
classes, having some other activity, preferably not related to reading,
put in their place. At the end of the experimental period,
both groups are given a reading post-test. The null-hypotheses are:
(a) if there was no significant difference between the groups on
the pre-test, there should be no significant difference on the post-test;
(b) neither group should show a significant improvement in
scores on the post-test as compared to the pre-test. If, on the other
hand, either of the null-hypotheses was overturned, this would be
evidence that grammar teaching had the effect of improving reading
performance. We should perhaps add, as a third possibility,
that a significant effect may be found, but will be negative, i.e.
that the grammar teaching had a detrimental effect on reading.
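One possible way of analysing such a design is to compare the gain scores (post-test minus pre-test) of the two groups. The sketch below is an assumption on our part rather than a prescribed procedure: it uses Welch's t statistic for two independent groups, the function names `gains` and `welch_t` are hypothetical, and the scores are invented for illustration. It does not compute a significance level, for which a statistical table or package would be needed.

```python
from math import sqrt
from statistics import mean, variance


def gains(pre, post):
    """Per-student gain: post-test score minus pre-test score."""
    return [after - before for before, after in zip(pre, post)]


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups of scores.

    Positive when group_a has the higher mean, negative when lower.
    """
    se = sqrt(variance(group_a) / len(group_a)
              + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical reading scores for five students per group.
experimental_gain = gains([40, 45, 38, 50, 42], [48, 50, 45, 58, 47])
control_gain = gains([41, 44, 39, 49, 43], [43, 45, 40, 52, 44])

# A large positive value would be consistent with a beneficial effect
# of the grammar classes; a large negative value with a detrimental one.
print(round(welch_t(experimental_gain, control_gain), 2))
```

Working with gain scores has the convenience that the two groups need not have identical pre-test means, which fits the remark above that equivalence on the pre-test is desirable but not strictly necessary.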
Care would have to be taken that the grammar component was
taught as formal grammar, i.e. that ‘communicative’ elements were
excluded as much as possible. Given the value placed nowadays
(probably rightly) on such communicative emphasis, this might be
a problem for some teachers. However, our
own experience has been that students are quite receptive to formal
grammar, so this may not be a major problem. It should be noted
that in this research design, a grammar test is not strictly necessary,
though it could be included.
A number of such experiments, conducted with different groups
of students in different places, should establish whether the
teaching of formal syntax, at least, had a beneficial effect on reading
performance.
Conclusions
Grammar is a component of reading which has been almost
ignored in the research. It seems to us that this is an interesting
and potentially valuable research area which L2 teachers and
applied linguists are in a good position to investigate.
5.3 THE USE OF TESTS TO INVESTIGATE
COMPONENTIALITY IN READING RESEARCH
In the previous section we have stressed the importance of
defining terms clearly and adequately as the basis for valid and reliable
research. In addition to a clear idea of what is to be investigated,
research needs to be credible to an outside audience in terms of
design, sampling, methodological procedures, analysis and
reporting. It should be logical with clear progression from research
questions through data collection, analysis, conclusions and
recommendations. Crucially it should be systematic with clearly stated
procedural rules not only to guard against threats to validity and
reliability but also to allow replicability by other researchers.
Davies (1990) has described language testing as the cutting
edge of applied linguistics. He supports this argument by
suggesting that one of the single most effective measurement tools for
exploring the nature of language proficiency or language
acquisition is the language test. We share the view and believe that
language tests offer a reliable and rigorous means for exploring the
componentiality of reading though we are also well aware that
there are a number of qualitative procedures that can produce
data that complement those generated by language tests.
Language tests tell us little of the processes that underlie
reading and we need to employ different methodological procedures
to investigate these. In particular, introspective methods may help
shed light on underlying thought processes (see Cohen, 1987; Cohen
et al., 1979; and Rankin, 1988). Olson et al. (1984) point out a
number of problems associated with the method, such as the time
taken to administer and analyse, limited sampling and sensitivity
to instructional variables. However, methods such as introspection
and retrospection may offer insights into the perceived processes
that take place during different types of reading and help us
understand the nature of the differences in processing as well as
the existence of such differences.
In this section we are limited by space. Our concern is thus
with the latter, i.e. componentiality, the issue of the divisibility of
the construct, rather than unpacking the mental processes. The
discussion below is, accordingly, for the most part limited to
testing. This should not be taken to signify that we consider other
methodological procedures, or the investigation of process, as being
any less important.
Research is only as good as the tools that are used to
operationalise constructs. Inadequacies or limitations in these will constrain
the value of any research. In order to investigate the componentiality
of reading systematically, we need to develop maximally valid
operationalisations of what we believe to be the important
elements of that construct in the form of texts and associated tasks.
This is not just the concern of researchers. Anyone who teaches
reading in the language classroom is putting into practice his or
her view of the construct of reading every time reading-connected
activities are carried out by their students. Any of these activities
are open to investigation to evaluate their worth in relation to
impact on students’ reading abilities.
We propose below a number of a priori and a posteriori
procedures, which constitute a systematic approach to investigating the
componentiality of the reading ability of students principally
through testing. They should help illuminate whether reading is a
unitary activity or whether it is made up of separable components,
for example: expeditious types of reading as in search reading,
skimming, scanning for specifics, and careful reading at the global
and local levels. They will tell us about the relative contribution of
the posited skills and strategies to the overall picture of a student’s
reading ability. They will tell us about the relationship between
these components and inform us of the relative weaknesses and
strengths of our students. Whether for formative or summative
purposes, such evaluation can impact on whole educational systems
as well as individual classrooms.
These procedures (mutatis mutandis) should be applicable to
the development of any reading test from national to classroom
level. Within the constraints of the classroom all of the
procedures may not be practical at one particular point in time, but
every teacher should be aware that, to produce the most accurate
picture of a student’s transitional performance in reading, all have
a contribution to make. The data from these procedures are all
grist to the construct validity mill. They can all shed light on what
it is we are measuring and how well we are doing this. The more
of these we can embrace in our research investigations into
reading the better founded might be our findings.
There are no short cuts in rigorous research. This does not
mean it is the preserve of the few or the well resourced.
Small-scale research systematically carried out can be synthesised to
provide real advances. What is required, however, is a comprehensive
but common framework of description of what is to be
investigated available to researchers, teachers and testers as well as the
development of systematic procedures that allow full comparison
across studies. Language testing encourages explicitness in
specification and through its potential systematicity offers the possibility
of generalising beyond a particular study.
An overview of the research methodology to investigate the
componentiality of reading is presented below. The exemplification
is from our investigations into EAP reading in China (see
Appendix 1) and in Egypt (see Khalifa, 1997). However, the methods
and approach are generic and should for the most part apply to
all reading situations.
A METHODOLOGY FOR INVESTIGATING THE
EAP READING CONSTRUCT
Example
To establish a specification of operations and performance
conditions to be tested we pursued a number of avenues in the
development of the Advanced English Reading Test (AERT) for
undergraduates in China (see Appendix 1 for background to the
project and for details of the specification). The following tasks
were carried out:
■A needs analysis of reading in the EAP context through
document inspection, interviews, group discussion and questionnaire
demonstrated the need for expeditious reading strategies/skills
as well as careful reading; for coping with longer texts (1000-3000
words) as well as shorter texts (<1000 words). It emphasised the need to
select texts from journals and books rather than newspapers.
■A review of theories of the reading process and available research
findings (together with the needs analysis) showed the need for
embracing a wider view of reading which would take into account
expeditious reading as well as the more usual careful reading
and for considering comprehension at the global as well as the
local level.
■An analysis of current learning tasks used in teaching materials
in reading English for Academic Purposes demonstrated the
need to go beyond the traditional concern with slow careful
reading to include tasks focusing on quick, efficient, selective
reading and raised the issue of prediction activities in relation
to activating existing schemata.
■An analysis of test tasks used in assessing reading English for
Academic Purposes showed, among other things, the importance
of providing a purpose for each reading activity; the importance
of controlling time spent on each activity; the importance of
establishing a minimal and maximal level of topic familiarity;
the types of format that lent themselves to the testing of the
various operations.
These data clarified the nature of reading operations across
discipline areas in the Chinese academic context and led to the
specification in Appendix 1. These investigations also provided data
on the conditions for reading activities in EAP. The specifications
for performance conditions in the Chinese AERT are also listed
in Appendix 1. Once appropriate operations and conditions are
established these have to be implemented in a test.
A priori validation
Textmap content of texts to establish content to be extracted
according to purpose for reading
Potentially appropriate texts for a test population should initially
be selected by a moderating committee from a bank of such texts
on the basis of as close a match as possible with the performance
conditions laid out in the specifications (see Section 3.3 for a
discussion of these conditions). The committee would at the same
time need to confirm that the texts selected allowed the testing of
the intended operations.
A practical method of doing this is to establish the content that
might be extracted from a text in line with the established
purpose for reading it. Various systems of text analysis are proposed
in the literature (see Section 2.2, text structure) but, though
impressive in their attention to detail and replicability/reliability, they
consume inordinate amounts of time and the end results do not
necessarily enable the researcher or test developer to decide which
are the important ideas in a text for testing. A more
utilisation-focused procedure is to try to establish the main ideas of a passage
through expeditious or careful ‘textmapping’ procedures (see
Buzan, 1974; Geva, 1980, 1983; Nuttall, 1996; Pearson and
Fielding, 1991).
In each textmapping procedure an attempt should be made to
replicate a single type of reading on a single text, e.g. reading a text
slowly and carefully to establish the main ideas. The product of the
particular reading of a text can be compiled in the form of a
spidergram or as a linear summary. This is first done individually
and then the extent of consensus with colleagues who have
followed the same procedure is established. The objective of the
procedure is to examine whether what we have decided is important
is in line with the specified type of reading activity and matches what
colleagues consider important (see Sarig, 1989, for an interesting
empirical investigation of this procedure).
This is a crucial first step in trying to ensure the validity of our
tests. We would be concerned that the answers to the questions we
then wrote revealed the important information in the text that
could be extracted by the particular type of reading being assessed.
An ability to answer the items should indicate that the candidate
has understood the passage in terms of successful performance of
the specified operation(s).
To illustrate the technique of textmapping and to demonstrate
how this can help to summarise in note form the products of
reading a text for a variety of purposes, we give an example in
Appendix 2 of the procedure for textmapping a text for developing
a test of careful reading.
The procedure would remain the same for other skills and
strategies but the key conditions of
■time allowed for the textmapping
■length of text
would alter in line with discussion on performance conditions
presented in Section 3.3 above.
The parts of the EAP reading construct in the exemplification
below are expeditious and careful reading at both global and
local levels (see Sections 3.2 and 4.3 above for a discussion of these
issues). We would, therefore, want to test these strategies/skills
using different passages (to ensure the independence of items
and to avoid the possibility of muddied measurement). For this
reason a different set of procedures is necessary for each to reflect
as closely as possible the processing behind each of those skills or
strategies.
It is important to note that the time for the textmapping task
provides a benchmark for the actual test time. All too often test
constructors take considerable periods of time reading and
re-reading texts and they peel off deeper and deeper levels of
meaning. They then give candidates 20 minutes or so to reach the same
depth of understanding under exam conditions. This is obviously
a nonsense. The candidates would not normally be expected to
find answers to questions in a shorter period than it has taken the
test setter. Conversely, if one wishes to test expeditious strategies
then the time/text length ratio should not allow test takers to
process in a careful non-selective fashion.
The textmapping procedure represents a principled way of
avoiding this particular threat to the test’s validity. If it can be
done with students who are at a suitable level of reading ability
drawn from the population who will eventually take the test, this
may be even more valid than using language or subject specialists
to perform this activity.
Write more questions, and use more texts, than you will need
The textmapping procedure will provide the content for each
section of the test. It will also show whether the passage is suitable
for its intended reading purpose. Where it is possible to produce
a consensus textmap this, then, needs to be converted into
appropriate test items in the format selected. Where consensus is not
achieved or the textmapping produces too few items, these texts
must be rejected!
Wherever possible it is advisable to write more items than are
needed in case some of them do not work. One cannot tell in
advance of empirical trialling those items that will work and those
that will not. It is best to trial items on small numbers initially,
because it may well take two or three attempts before problems in
wording are resolved. Try the test on colleagues or a few students
at a time. If the test is piloted on all immediately accessible
candidates to begin with, then this could be problematic. The
important trialling on larger numbers should not take place until it is
reasonably certain that the items are working well and any obvious
problems have been eliminated.
Careful attention needs to be given to:
■Rubric: Are the instructions clear, accessible and unequivocal?
■Timing: Is the timing for each reading type on each passage
appropriate given the length of the text and the activities we are
expecting test takers to perform on it?
■Order of questions/process dimension: Do the order of the questions
set on a particular passage and the order of the reading types in
the test as a whole encourage the reading behaviours we are
hoping to test?
■Layout: Does the layout help the students to work efficiently
through each subtest; does it appear elegant and neat?
A posteriori validation
Trial on reasonable sample
Once the necessary development preliminaries described above
are completed, it is important to trial the test on as broad a sample
of the intended population, in terms of ability, as possible and then
subject the results to statistical analysis to establish the test’s value
as a measuring instrument.
It is important at the trialling stage to administer the research
instrument to as normally distributed a sample as possible. This
might mean purposefully sampling from top, medium and lower
universities, institutions, schools and classes within these. Normally
distributed data allow the researcher to apply the statistical
analysis recommended below to establish how the items in the subtests
are functioning. A skewed sample, where the majority of students
are too strong or too weak, will not allow the researcher to do
this. This is why samples of fewer than 30 are normally not
recommended. The distribution of scores achieved by the sample on the
test should allow two standard deviations to fit in either side of
the mean. In Figure 5.1 the test is out of 60, the mean is 32.9 and
the standard deviation is 9.91. So in this data set we can get two
standard deviations (2 x 9.91 = 19.82) easily either side of the mean:
the interval from 13.08 to 52.72 lies within the score range of 0 to 60.
This tells us that we have a distribution approximating to normal and
we can continue with the further analysis discussed below.
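The rule of thumb above is easy to check mechanically. The following is a minimal sketch in Python; the function name `two_sd_fit` is our own label for illustration, and the check is only the rough screening described in the text, not a formal test of normality.

```python
def two_sd_fit(mean_score, sd, max_score, min_score=0):
    """Return whether mean +/- 2 SD lies inside the possible score range,
    together with the lower and upper ends of that interval."""
    low = mean_score - 2 * sd
    high = mean_score + 2 * sd
    return (low >= min_score and high <= max_score), low, high


# Figures quoted in the text: test out of 60, mean 32.9, SD 9.91.
fits, low, high = two_sd_fit(32.9, 9.91, 60)
print(fits, round(low, 2), round(high, 2))
```

A sample whose mean sat close to the ceiling or floor of the test would fail this check, which is the skewed case the text warns against.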
[Figure 5.1: distribution of scores on the trial test. Only the axis ticks survive in this copy.]
1 and 0.2 is often taken as the cut-off point for acceptable
discrimination. Items 02 and 06 are not discriminating very well and
both would need to be considered carefully for exclusion.
Discrimination is important in developing the reading test
battery because it demonstrates that an item can reliably discriminate
between a person who has the ability and a person who does not.
It may well be that such discrimination is lacking on later samples
of students who take the test, because they are all good or all weak.
This does not matter if the item has been shown to discriminate
in the piloting on a normally distributed representative sample of
the large potential population.
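One common way of expressing discrimination is the corrected item-total correlation, which is the index reported in Table 5.1(b). The sketch below, in Python, assumes dichotomously scored items (1 = correct, 0 = incorrect); the response matrix is invented, the function names are hypothetical, and we do not claim that this is the exact index used in the discussion above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(ss_x * ss_y)


def corrected_item_total(responses, item):
    """Correlate one item with the total of all the *other* items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


# Hypothetical data: rows are candidates, columns are items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

CUT_OFF = 0.2  # the cut-off mentioned in the text
for item in range(len(responses[0])):
    r = corrected_item_total(responses, item)
    verdict = "acceptable" if r >= CUT_OFF else "review"
    print(f"item {item + 1}: r = {r:.3f} ({verdict})")
```

A negative value from `corrected_item_total` would flag an item that must be rejected or rewritten, in line with the point made below that a negative index is never acceptable.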
There may be cases where an item tests comprehension of an
important idea and we have to decide whether we can accept a
low facility value and a lower positive index of discrimination. A
negative discrimination index is never acceptable. The validity of
what we are testing must always come first. If the main idea of a
passage proves easy to extract, so be it. The text has been selected
as representative of the domain the respondent has to cope with
on a principled basis.
Table 5.1(b) Reliability analysis: scale (alpha)

Statistics for scale
Mean      Variance   SD       No. of variables
8.7888    8.8095     2.9681   15

Item-total statistics
          Scale mean   Scale variance   Corrected     Alpha
          if item      if item          item-total    if item
          deleted      deleted          correlation   deleted
ITEM 01   8.0627       7.9331           0.2690        0.6711
ITEM 02   8.0693       8.1442           0.1802        0.6820
ITEM 03   8.4257       7.8148           0.2832        0.6695
ITEM 04   8.1155       7.8177           0.2935        0.6681
ITEM 05   8.2145       7.6128           0.3481        0.6605
ITEM 06   7.9142       8.3039           0.2069        0.6776
ITEM 07   8.0792       7.8215           0.3072        0.6663
ITEM 08   8.0792       7.7354           0.3429        0.6618
ITEM 09   8.2739       7.8684           0.2459        0.6748
ITEM 10   8.3069       7.5843           0.3536        0.6597
ITEM 11   8.1452       7.7868           0.2960        0.6677
ITEM 12   8.1815       7.8643           0.2573        0.6730
ITEM 13   8.5314       7.8459           0.3146        0.6656
ITEM 14   8.5413       7.9776           0.2642        0.6717
ITEM 15   8.1023       7.6021           0.3869        0.6558

Reliability coefficients: No. of cases = 303; Alpha = 0.6836;
No. of items = 15
Internal consistency
Internal consistency estimates are often cited as indicators of
reliability, but they
are just as useful in exploring the construct validity of a subtest.
They provide evidence that items within a subtest are measuring a
construct in a similar fashion. Table 5.1(b) provides some data on
the internal consistency of a pilot version of a subtest designed
to measure careful reading for main idea extraction. You can see
the internal consistency data in the column headed ‘Alpha if
item deleted’. The alpha for the data is given at the bottom of
Table 5.1(b) at 0.6836. The ‘Alpha if item deleted’ column tells us
whether an item is contributing to the overall internal consistency
of the test. In this case the overall alpha for this component would
not be improved by removing any item although it is
noticeable that items 02 and 06 are contributing the least among the
15 items.
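For readers who wish to see how figures of this kind are arrived at, the following Python sketch computes Cronbach's alpha and the 'alpha if item deleted' values for a small response matrix. The data are invented and the function names are our own; the output of a statistics package, such as that shown in Table 5.1(b), remains the authority for a real analysis.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_deleted(responses, item):
    """Alpha recomputed with one item (column) left out."""
    reduced = [row[:item] + row[item + 1:] for row in responses]
    return cronbach_alpha(reduced)


# Hypothetical data: rows are candidates, columns are items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

overall = cronbach_alpha(responses)
print("alpha:", round(overall, 4))
for item in range(len(responses[0])):
    value = alpha_if_deleted(responses, item)
    note = "removal would raise alpha" if value > overall else "item contributes"
    print(f"item {item + 1}: alpha if deleted = {value:.4f} ({note})")
```

An item whose deletion raises alpha above the overall figure is a candidate for removal; in Table 5.1(b) no item meets that condition.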
The data on facility value, discrimination and internal
consistency can all help the researcher to take decisions on how to select
items which provide the best measure of the construct they wish
to investigate through the test(s).
Estimate of reliability
Marker reliability
It is also important to demonstrate that the data have been marked
reliably, otherwise the reliability of the results themselves will be
affected. It is usual to xerox a number of answer sheets, 30 plus,
and to have these marked by all the markers involved in the study.
Ideally markers should themselves receive the same set of sheets
at a later stage for remarking to establish intra-marker reliability.
The reliability of a test is a combination of its internal consistency,
inter- and intra-marker reliability. Formulae exist for combining
these to provide an overall reliability estimate which will help the
reader of the research to understand the extent to which one can
depend on the results.
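As an illustration of the kind of check involved, the sketch below (Python) compares two sets of marks awarded to the same answer sheets, using the proportion of exact agreement and the Pearson correlation. The marks are invented, only eight sheets are shown rather than the 30 plus recommended above, and the two indices are our own choice of illustration; the formulae for combining reliability estimates are not reproduced here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(ss_x * ss_y)


def exact_agreement(xs, ys):
    """Proportion of answer sheets given identical marks."""
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)


# Hypothetical marks out of 15 for the same eight answer sheets.
marker_a = [12, 9, 14, 7, 10, 11, 5, 13]
marker_b = [12, 10, 14, 7, 9, 11, 6, 13]

agreement = exact_agreement(marker_a, marker_b)
correlation = pearson(marker_a, marker_b)
print("exact agreement:", agreement)
print("inter-marker correlation:", round(correlation, 3))
```

The same two functions serve for intra-marker reliability if `marker_b` is replaced by the same marker's second marking of the sheets.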
greater difficulty with the latter than with the former. These
cross-tabulated data are summarised in Figure 5.3 which provides an
even clearer view of the potentially differing performance abilities
in these two areas. With a notional pass mark of 60 per cent (9/15)
substantially more would fail the local but pass the global careful
reading test (85) than vice versa (31).
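The collapsing of a full cross-tabulation into pass/fail bands can be sketched as follows in Python. The paired scores are invented for illustration and the function name is hypothetical; only the pass mark of 9 out of 15 is taken from the text.

```python
from collections import Counter

PASS_MARK = 9  # 60 per cent of 15 items, as in the text


def pass_fail_table(global_scores, local_scores, pass_mark=PASS_MARK):
    """Count candidates in each (passed global, passed local) combination."""
    table = Counter()
    for global_score, local_score in zip(global_scores, local_scores):
        table[(global_score >= pass_mark, local_score >= pass_mark)] += 1
    return table


# Hypothetical paired scores out of 15 for eight candidates.
global_scores = [10, 12, 9, 7, 11, 8, 13, 6]
local_scores = [7, 10, 8, 9, 6, 5, 12, 4]

table = pass_fail_table(global_scores, local_scores)
print("pass global, fail local:", table[(True, False)])
print("fail global, pass local:", table[(False, True)])
print("pass both:", table[(True, True)])
print("fail both:", table[(False, False)])
```

It is the imbalance between the first two counts that bears on divisibility: if the two measures tapped a single undifferentiated ability, we would expect the off-diagonal cells to be roughly equal.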
If the findings reported in the analyses above were to be replicated
using differing measures and different samples, then we might
begin to synthesise an argument for the divisible nature of the
reading construct.
Figure 5.2 Cross-tabulation of total careful reading and total lexical items.
[The body of this frequency table (scores 0.00 to 15.00, N = 303) is
not legible in this copy; only the marginal totals survive.
Row totals for scores 1.00 to 15.00 (no entries for 0.00):
3, 3, 8, 11, 18, 26, 28, 40, 38, 40, 26, 25, 23, 13, 1.
Column totals (16 values, as printed):
2, 4, 10, 24, 22, 24, 28, 41, 36, 33, 29, 24, 12, 5, 9, 0.]
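[Figure 5.3: TOTCARE (rows) by TOTLEXI (columns), each banded as
0.00-8.00 and 9.00-15.00. Two cell counts survive in this copy: 106 in
the 0.00-8.00 TOTCARE band and 81 in the 9.00-15.00 TOTCARE band.]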
Introspection
An introspection study into the students’ process of reading texts
and answering the questions should be carried out to find out
what skills and strategies students are using in completing each
section of the test. A procedure for this was developed with
Shanghai Jiaotong University staff in the AERT project, PRC. The students
were trained to think aloud onto tapes in a language laboratory
while taking the test. Before the test, a training session was necessary
to demonstrate what they were expected to do during the test.
Students should be allowed to use L1 where necessary in
their verbal reports. The data are then transcribed and content
analysed in terms of the test operations.
Retrospection
A separate retrospection study enables the researchers to obtain
a larger data set (than is possible through the time-consuming
spoken protocols) to establish student perceptions of the skills
and strategies used in the process of taking the test. This can be
carried out in the large-scale trialling of the test. It can be
incorporated into the process of doing the test by providing a checklist
for candidates to tick after they finish each section of the test (see
Table 5.4).
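Once the checklists are collected, the ticks can be tallied by test section. The Python sketch below is one simple way of doing so; the responses are invented, the function name is hypothetical, and the strategy labels are taken from the surviving rows of Table 5.4.

```python
from collections import Counter, defaultdict

OVERALL = "Reading quickly to get the overall idea of a text"
SEARCH = "Reading quickly to search for information on main ideas"
SPECIFIC = "Reading quickly to find specific information"

# Hypothetical ticks: (test section, strategy the candidate reported using).
ticks = [
    ("skimming", OVERALL),
    ("skimming", OVERALL),
    ("skimming", SEARCH),
    ("search reading", SEARCH),
    ("search reading", SEARCH),
    ("scanning", SPECIFIC),
    ("scanning", SPECIFIC),
    ("scanning", OVERALL),
]


def tally(ticks):
    """Count how often each strategy was ticked within each test section."""
    counts = defaultdict(Counter)
    for section, strategy in ticks:
        counts[section][strategy] += 1
    return counts


counts = tally(ticks)
for section, strategies in counts.items():
    top_strategy, frequency = strategies.most_common(1)[0]
    print(f"{section}: most reported = {top_strategy!r} ({frequency} ticks)")
```

A section in which the most frequently reported strategy differs from the one the section was designed to test would merit a closer look at its texts, items or timing.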
Experts’ judgements
Apart from students’ introspection and retrospection, language
testing experts and reading experts should be asked to give their
professional opinion of the constructs being tested. Table 5.4 is
Figure 5.4 Extract from the test feedback questionnaire used in the AERT project in China.
[Questionnaire to candidates: the response grid is not legible in this
copy. Surviving row labels: Design; Rubrics.]
Table 5.4 Student retrospection/expert judgement sheet
[Only the following row labels survive in this copy.]
■Reading quickly to get the overall idea of a text
■Reading quickly to search for information on main ideas
■Reading quickly to find specific information: words/numbers/symbols
Revision
As a result of the qualitative and quantitative investigations
described above the researcher is well equipped to make any
necessary amendments to the pilot version of the research instruments
to make them more valid operationalisations of the intended
construct.
On the basis of the procedures discussed above we would have
sufficient data to help us revise our test instruments to ensure that
they come closer to performing the job intended.
Conclusion
None of this is easy to follow through. Not all of these things are
possible for the teacher in the classroom except over an extended
period of time. What is clear, however, is that the more rigorous
and comprehensive we can be in our investigations, the clearer
the account that is likely to emerge of the nature of reading.
Clear specification of what we are trying to teach or test, and
soundly conceived methodologies for investigating components
and processes, are essential. It is hoped that this book is of some
value in this endeavour.
Notes
1. Richard Joung, November 1997.
2. One of the authors teaches British undergraduates, who regularly
assure him that the initial sequence [kn] is impossible to pronounce,
a judgement that might have seemed strange to their ancestors.
3. This is a broader definition than that used by Bernhardt, who restricts
her attention to examination of the reading process.
4. Only for a moment, though. The effect of such analogies is to give
a spurious impression of precision regarding fairly vague and ill-
understood processes.
Appendix 1
Conditions and descriptions

Size of input/length of text
    3 short passages (approx. 600-900 words) for careful reading (global), 15 items
    3 short passages (approx. 250-500 words) for careful reading (local), 15 items
    3 long passages (approx. 1000-1800 words) for expeditious reading (global), 15 items
    3 long passages (approx. 1000 words) for expeditious reading (local), 15 items
Speed of processing
    144 minutes for a total of 12 passages: about 60-90 wpm for careful reading;
    100-150 wpm for expeditious reading.
Control over skills/strategies
    Three passages for each skill/strategy, one from arts and humanities, one from
    science and technology, one from life and medical science. For careful reading,
    passages are short and may sometimes have relatively implicit text structure.
    For expeditious reading, passages are long and may sometimes have relatively
    explicit text structures.
Control over time spent
    Time is strictly controlled both for each section and for each passage within
    the section.
    Careful reading (global): 60 minutes, 20 for each passage;
    Expeditious reading (skimming): 15 minutes, 5 for each passage;
    Expeditious reading (search reading): 21 minutes, 7 for each passage;
    Expeditious reading (scanning): 18 minutes, 6 for each passage;
    Careful reading (local): 30 minutes, 10 for each passage.
Amount of help
    General instructions (in Chinese) to candidates are provided 15 minutes before
    the test. Instructions for each section are clearly written on a separate page
    in the question booklet and students are reminded to read instructions before
    texts. Example provided for the true/false/justification items since candidates
    are not familiar with format.
Number and ordering of tasks
    Order for the five sections: careful reading (global), skimming, search reading,
    scanning, careful reading (local).
Method factor/response mode
    Formats include: SAQ, true/false, table/flow chart/sentence/text completion.
Question/answer in L1/TL
    Mainly in English but could be in Chinese if necessary.
Receptive/productive
    Mainly receptive, some limited writing involved in SAQ but only brief answers
    will be required; no more than 10 words.
Explicitness of weighting
    All items equally weighted
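The timing figures in the table above can be checked for internal consistency with a few lines of Python. This is simply a worked piece of arithmetic on the numbers given under 'Control over time spent' and 'Speed of processing'; the variable names are our own.

```python
# (section minutes, minutes per passage), copied from the table above.
timings = {
    "careful reading (global)": (60, 20),
    "skimming": (15, 5),
    "search reading": (21, 7),
    "scanning": (18, 6),
    "careful reading (local)": (30, 10),
}

# How many timed passage readings each section allows for.
readings_per_section = {
    name: section_minutes // per_passage
    for name, (section_minutes, per_passage) in timings.items()
}
total_minutes = sum(section_minutes for section_minutes, _ in timings.values())

for name, count in readings_per_section.items():
    print(f"{name}: {count} timed readings")
print("total test time:", total_minutes, "minutes")
```

The section times sum to the 144 minutes stated under 'Speed of processing', and each section divides evenly into three timed readings.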
Table A1.2 A taxonomy of skills and strategies in reading for academic purposes

Types of reading strategies and skills: Expeditious reading strategies
(Skimming, Search reading, Scanning)

Purpose
  Skimming
    Processing a text selectively to get the main idea(s) and the discourse
    topic as efficiently as possible - which might involve both expeditious
    and careful reading.
    ■To establish a general sense of the text;
    ■To quickly establish a macropropositional structure as outline summary
     without decoding all the text;
    ■To read more efficiently;
    ■To decide the relevance of texts to established needs.
  Search reading
    Locating information on predetermined topic(s) (e.g., in the form of
    questions set on main idea(s) in a text). This normally goes beyond mere
    matching of words (as in scanning). The process is selective but is likely
    to involve careful reading once relevant information has been located.
  Scanning
    Looking quickly through a text - not necessarily following the linearity
    of the text - to locate a specific symbol or group of symbols: e.g., a
    particular word, phrase, name, figure, or date.

Operationalisations
  Skimming
    Where appropriate to text-type:
    ■Reading title and subtitles quickly.
    ■Reading abstract carefully.
    ■Reading introductory and concluding paragraph carefully.
  Search reading
    Keeping alert for words in the same or related semantic field (not certain
    of precise form of these words).
    Using formal knowledge for locating information.
  Scanning
    Looking for (matching):
    ■specific words/phrases
    ■figures, percentages
    ■dates of particular events
    ■specific items in index
text.
Appendix 2
MAPPING A TEXT - I
Time for the mapping activity should not exceed the actual test
time. The aim is to identify items that can be used to test careful
reading at the global level, that is those which test main ideas.
Text length, c.600 words
Stage I
■On a master sheet of paper, list the main ideas on which the
group members agree.
■Write the number of people who agreed on them, e.g. 4/5
(4 out of 5 agreed).
■Normally agreement of N - 1 is necessary, i.e. if there are 5
people in the group, at least 4 must have included the point
for a consensus.
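The N - 1 rule in the steps above can be sketched as follows in Python. The sketch assumes that the group has already agreed which of their individually worded points count as 'the same idea', so that each mapper's list can be reduced to a set of shared labels; that matching is a matter of human judgement and is not automated here. The labels and the function name are hypothetical.

```python
def consensus_ideas(maps, tolerance=1):
    """Return the ideas listed by at least N - tolerance of the N mappers,
    each with its agreement count in the 'k/N' form used on the master sheet."""
    group_size = len(maps)
    threshold = group_size - tolerance
    all_ideas = set().union(*maps)
    agreed = {}
    for idea in sorted(all_ideas):
        count = sum(idea in mapper for mapper in maps)
        if count >= threshold:
            agreed[idea] = f"{count}/{group_size}"
    return agreed


# Hypothetical textmaps from a group of five, already coded to shared labels.
maps = [
    {"idea A", "idea B", "idea C"},
    {"idea A", "idea B"},
    {"idea A", "idea B", "idea D"},
    {"idea A", "idea C"},
    {"idea A", "idea B", "idea C"},
]

master_sheet = consensus_ideas(maps)
for idea, agreement in master_sheet.items():
    print(idea, agreement)
```

With five mappers the threshold is four, so an idea listed by only three of the group does not reach the master sheet.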
Notes may not be taken: The aim is to extract only the main ideas/
the macropropositions and to avoid jotting down micropropositions
or minor detail. If careful reading is an incremental process then
arguably we cannot establish the macrostructure until we have
read all the text. By not allowing notes to be taken we are trying to
avoid experiences in the past where mindmappers have written
down a lot of peripheral detail. If we only transfer important
information from working memory to long-term storage then this
might be a way of achieving that. Mental rehearsal/monitoring of
what is important is an indicator of good reading.