Toward A Theory of Automatic Information Processing in Reading
6, 293-323 (1974)
Among the many skills in the repertoire of the average adult, reading
is probably one of the most complex. The journey taken by words from
their written form on the page to the eventual activation of their meaning
involves several stages of information processing. For the fluent reader,
this processing takes a very short time, only a fraction of a second. The
acquisition of the reading skill takes years, and there are many who do
not succeed in becoming fluent readers, even though they may have
quickly and easily mastered the skill of understanding speech.
During the execution of a complex skill, it is necessary to coordinate
many component processes within a very short period of time. If each
component process requires attention, performance of the complex skill
will be impossible, because the capacity of attention will be exceeded.
But if enough of the components and their coordinations can be processed
automatically, then the load on attention will be within tolerable limits
and the skill can be successfully performed. Therefore, one of the prime
issues in the study of a complex skill such as reading is to determine how
the processing of component subskills becomes automatic.
This research was supported by a grant (HD-06730-01) to the authors from the
National Institutes of Child Health and Human Development, and in part by the
Center for Research in Human Learning through National Science Foundation Grant
GB-17590. Reprints may be requested from David LaBerge, Department of
Psychology, University of Minnesota, Minneapolis, Minnesota 55455.
Copyright © 1974 by Academic Press, Inc.
All rights of reproduction in any form reserved.
LABERGE AND SAMUELS
requires attention, then the response latency will include both time for
stimulus processing and time for attention switching. If, however, the
stimulus processing does not require attention (i.e., it is automatic), then
the response latency will not include stimulus processing time, assuming
that the stimulus processing is completed by the time attention is
switched.
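The latency argument above can be made concrete with a minimal sketch. The function name, the parameter names, and the millisecond values are illustrative assumptions, not figures from the study. The automatic branch uses max() so that it also covers the case in which processing outlasts the attention switch, which goes slightly beyond the assumption stated in the text.

```python
def response_latency(switch_ms, processing_ms, requires_attention):
    """Predicted latency component in ms (hypothetical helper for illustration)."""
    if requires_attention:
        # Processing cannot start until attention arrives, so the times add.
        return switch_ms + processing_ms
    # Automatic processing runs while attention is being switched. If it
    # finishes first (the case assumed in the text), only the switch time
    # remains in the latency.
    return max(switch_ms, processing_ms)


# Illustrative values only (not data from the experiments).
attended = response_latency(switch_ms=100, processing_ms=50, requires_attention=True)
automatic = response_latency(switch_ms=100, processing_ms=50, requires_attention=False)
print(attended, automatic)  # 150 100
```

Under these assumed values, the difference between the two predictions equals the stimulus processing time, which is the quantity the latency comparison is meant to expose.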
[Figure 1 diagram: VISUAL MEMORY, attention center (A)]
FIG. 1. Model of visual memory showing two states of perceptual coding of visual
patterns. Arrows from the attention center (A) to solid-dot codes denote a two-way
flow of excitation: attention can activate these codes and be activated (attracted)
by them. Attention can activate open-dot codes but cannot be activated (attracted)
by them.
from the attention center to visual codes represent the activation of these
units by attention. When a stimulus occurs and activates a code, a signal
is sent to the attention center, which can “attract” attention to that code
unit in the form of additional activation. Only the well-learned, filled-
circle codes can attract attention. If attention is directed elsewhere at
the time a visual code is activated by external stimulation, attention will
not shift its activation to that visual code, unless the stimulus is intense
or unless the code automatically activates autonomic responses or other
systems which mediate the “importance” (Deutsch & Deutsch, 1963) or
“pertinence” (Norman, 1968) of that code to the attention center.
The many double arrows emanating from the attention center, therefore,
indicate potential lines of information flow to every well-learned code in
visual long-term memory. At any given moment, however, the attention
center activates only one code. This characteristic of the model represents
the limited-capacity property of attention.
As conceptualized here, attentional activation may have three different
effects on information processing. First of all, it can assist in the con-
struction of a new code by activating subordinate input codes. For ex-
ample, in Fig. 1, successive activation of features f, and f, is necessary to
synthesize letter code 1,. Secondly, activation of a code prior to the
presentation of its corresponding stimulus is assumed to increase the rate
of processing when that stimulus is presented (LaBerge et al., 1970).
Finally, activation of a code can arouse other codes to which it has been
associated, as will be described later in connection with Fig. 3.
Some of the most common patterns we learn to recognize are letters,
which are represented in Fig. 1 as l1, l2, etc. We assume that the first
stage of this learning requires the selection of the subset of appropriate
features from the larger set of features which are activated by incoming
physical stimuli. For example, assume that a child is learning to discrimi-
nate the letters t and h. The length of the vertical line is not relevant to
the discrimination of these two letters. Instead he must note the short
horizontal cross of the t and the concave loop in the h. These are the
distinctive features of these letters when considered against each other.
In the model, we represent the selection of features of a given letter by
the lines leading from particular features to a particular letter. In this
example, the length of the vertical line is an irrelevant feature, but when
these two letters are compared with other letters, for example the letter n,
the length of the vertical line becomes a relevant feature. We assume
that this kind of adjustment in feature selection continues as the rest of
the alphabet is presented to the child. One feature that seems to be ir-
relevant for all letters is thickness of line. It would appear, then, that
many features are selected for a given letter, and in many cases letters
AUTOMATIC READING PROCESSES
systems such as the phonological system. This position is close to the one
taken by Gough (1972), who makes a strong attempt to reconcile
letter-by-letter visual scanning with the apparently high rates of word
processing by fluent readers.
One could even move the argument up another level to consider spelling
patterns as the typical units of visual perception in reading, a position
preferred by Gibson (1971), although she maintains that these units must
eventually be reorganized into still higher-order units.
The critical point being made here is that automaticity in processing
graphemic material may not necessarily mean that unitizing has taken
place. Scanning pathways may have been learned to the degree that they
can be run off automatically and rapidly, whatever the size of the visual
code unit involved. The present model as depicted in Fig. 1 adapts itself
to the view that a letter may be a cluster of discrete features which are
scanned automatically. One simply equates the symbol 1, with the term
( f3, f,), to indicate that the code at the letter level is a cluster of feature
units. The interpretation of automaticity is the same. For the dashed lines
linking features with letters, the features cannot be adequately scanned
without the services of attention; for the solid lines, the features are
scanned automatically. For present purposes of exposition, however, we
find it more convenient to refer to letters, spelling patterns, and words as
unit codes, but we hope that the reader will keep in mind that there is
an alternative view of what it is that is being automatized in perceptual
learning of this kind.
Before extending the model to other stages of processing, such as
sounding letters, spelling patterns, and words or comprehension of words,
we will describe briefly an experiment which attempted to measure
automaticity of perception and use it as an indicator of amount of per-
ceptual learning of a graphemic pattern.
Indicators of automatic perceptual processing. One way to test recog-
nition of a letter is to present two letters simultaneously and ask the
subject to indicate if they match or not (Posner & Mitchell, 1967). In
order to determine whether a person can automatically recognize a letter
pattern, we must present a pair of patterns at a moment when he is not
expecting them. The way this was done in a recent study (LaBerge,
1973b) was to induce the subject to expect a letter, e.g., the letter a, by
presenting it first as a cue in a successive matching task. If the letter
which followed the cue was also an a, he was to press a button, otherwise
not. Occasionally, following the single letter cue a, the subject was given
a pair of letters other than the letter expected, e.g., the stimulus (b b).
If these letters matched, he was to press the button, otherwise not, regard-
less of what the cue was on that trial. In terms of Fig. 1, the state of the
FIG. 2. Mean latency and percent errors of matching responses to unfamiliar and
familiar letter patterns.
The results from the 16 college-age subjects are shown in Fig. 2. The
initial difference in latency between unfamiliar and familiar letters was
48 msec and the difference clearly decreased over the next four days. In
terms of the model in Fig. 1, we would say that the dashed lines between
features f, and f, and 1, were strengthened over days and approached
the automatic level of learning of the lines connecting f, and f, with 1,.
The finding that unfamiliar letters improved with practice more than
did familiar letters offers support for the hypothesis that something is
being learned about the unfamiliar letters over the days of training.
Evidence that subjects are learning automatic processing of the unfamiliar
letters is supported by a special testing condition presented to another
group of 16 subjects. In this condition, the familiar and unfamiliar pat-
terns were presented both as cues and target stimuli so that we could
assess the time taken to detect the letter when the subject expected that
letter. In terms of the model in Fig. 1, we assume for the unfamiliar let-
ter that the attention arrow to 1, is activated at the time the letter is
presented. Similarly, when a familiar letter, l,, is cued, the attention ar-
row is focused on 1, in preparation for that letter to be presented.
A comparison of latencies of these successive matches showed that the
time to make an unfamiliar match equals the time to make a familiar
match. This means that under conditions when the subject is attending
to these letters, differences between perceptual learning of letters are
not revealed. Only under conditions when the subject is attending else-
where at the moment when the test letter is presented do these differences
emerge.
Taken together, the data from these two conditions strongly suggest
that what is being learned over days is a perceptual process that operates
without attention, namely an automatic perceptual process. Whether the
[Figure diagram: columns VM, PM, and RS]
which can be organized with visual and phonological codes into a
superordinate code, indicated here by c1. These codes represent associations
that are in the very earliest stages of learning. The dashed lines connected
with the episodic code represent the fact that attention is required to
activate the code. With further practice, direct lines may be formed
between visual and phonological codes, for example the line joining
v( w,) with p( w,). This link is represented by a dashed line to indicate
that additional activation by attention still is necessary for the association
to take place. The solid lines joining visual and phonological codes
represent well-learned associations that occur without attentional
activation. All three types of associations (episodic, nonautomatic
direct, and automatic direct) are assumed to be at the accuracy
level.
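The three types of association just named can be summarized in a minimal sketch. The class and function names are hypothetical; only the three labels and the rule that attention is needed for all but the automatic direct link come from the text.

```python
from enum import Enum


class AssociationState(Enum):
    """Three states of a visual-phonological association (labels from the text)."""
    EPISODIC = "episodic"
    NONAUTOMATIC_DIRECT = "nonautomatic direct"
    AUTOMATIC_DIRECT = "automatic direct"


def requires_attention(state):
    """Only the automatic direct link operates without attentional activation."""
    return state is not AssociationState.AUTOMATIC_DIRECT


# All three states are assumed to be at the accuracy level; they differ
# only in whether attention must add its activation.
for state in AssociationState:
    print(state.value, requires_attention(state))
```

The sketch treats the states as discrete categories for clarity; it says nothing about how much practice moves an association from one state to the next.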
The initial association between a new visual pattern and its phono-
logical response is considered to be a fast learning process (Estes, 1970).
It may not occur on the first trial, but when it does occur, it appears to
happen in an all-or-none manner. For this state of learning, progress is
indicated customarily by percent correct or percent errors. When the
subject has achieved a criterion of accurate performance, the visual code
still requires attention whenever retrieval occurs through the episodic
memory code or through a direct dashed-line connection, even if the
perceptual coding of the visual stimulus itself is automatic. Further train-
ing beyond the accuracy criterion must be provided if the association is
to occur without attention, represented by the solid lines. The letter-
naming experiment soon to be described will serve as an illustrative ex-
ample of the associative learning this model is intended to represent.
Once a visual word code makes contact with the phonological word
code in reading, we assume that the meaning of the word can be elicited
by means of a direct associative connection between the phonological
unit, p(w1), and the semantic meaning unit, m(w1), as shown in Fig. 4.
Most of the connections between phonological word codes and semantic
meaning codes have already been learned to automaticity through ex-
tensive experience with spoken communications. In fact, authors of
children’s books purposely select vocabularies in which words meet this
condition. This takes the attention off the processing of meaning and
frees it for decoding. However, for a child in the process of learning
meanings of words, we assume that the linkage between a heard word
and its meaning may be coded first in episodic memory. This is repre-
sented in Fig. 4 by the organization of p( w,) and m(w,) and event el
FIG. 4. Representation of three states of associative
visual memory (VM), phonological memory (PM), semantic memory (SM), and
episodic memory (EM). Attention is momentarily focused on a code in episodic
memory.
and e, into the episodic code cp. Additional exposures to a word along
with activations of its meaning would begin to form a direct link between
the phonological unit and its meaning, represented by the dashed line
between p(w2) and m(w2). At these two states of learning, attention is
needed to activate the association of a heard word into its meaning, but
with enough practice, a word should elicit its meaning automatically,
as illustrated by the solid line joining p(w1) with m(w1).
At this point we may mention that the association between the phonological
form of a word and its meaning may go in the other direction, so
that activation of a meaning unit could automatically excite a phono-
logical unit. However, we are not prepared to specify in any detail how
this is done. We simply wish to indicate that generating speech by
activation of semantic structures also appears to be automatic, at least in
the general sense in which we are using the term here.
We should note the possibility in the model that a visual word code
may be associated directly with a semantic meaning code (Bower, 1970;
Kolers, 1970). That is, a unit, v(w1), may activate its meaning, m(w1),
without mediation through the phonological system. The fact that we
can quickly recognize the difference in the meaning of such homonyms
as “two” versus “too” seems to illustrate this assumption.
Indicators of automatic associative processing. The way we are cur-
FIG. 5. Mean latency and percent errors of naming responses to unfamiliar and
familiar letter patterns.
learning to sound that letter and to sounding spelling patterns and words
as well.
Turning to the association of word sounds with word meanings illus-
trated in Fig. 4, it is possible to perform learning experiments using
indicators of automaticity of associating meanings in much the same way
as we did for associating names. The only major difference in procedure
is that instead of asking the subject to name a letter, we ask him to press
a button if the word is a member of a particular category of meaning
(Meyer, 1970).
General model of automaticity in reading. In Fig. 7 all the memory
systems relevant to this theory of reading are shown together. We may
use this sketch to trace some of the many alternative routes that a visually
presented word could take as it proceeds toward its goal of activating
meaning codes. A given route is defined here not only in terms of the
particular systemic code encountered along the way, but also in terms of
whether or not attention adds its activation to any of these codes. A few
of the possible optional processing routes may be described as follows:
DEVELOPMENT OF AUTOMATICITY
Throughout this paper we have stressed the importance of automaticity
in performance of fluent reading. Now we turn to a consideration of
ways to train reading subskills to automatic levels. Unfortunately, very
little systematic research has been directed specifically to this advanced
stage of learning. Reviews of studies of automatic activity (Keele, 1968;
Welford, 1968; Posner & Keele, 1969) deal mostly with automatic motor
tasks and, to our knowledge, there are no studies which systematically
compare training methods which facilitate the acquisition of automaticity
of verbal skills. Therefore, our remarks here will be speculative, although
we are currently putting forth efforts in the laboratory to shed light on
this problem.
First of all, we would agree with most practitioners involved in skill
learning that practice leads to automaticity. For example, recognizing
letters of the alphabet apparently becomes automatic by successive ex-
posures (see Fig. 2). Sounding spelling patterns apparently becomes
automatic by repetition of the visual and articulatory sequences. Even
the meaning of a visual word would seem to achieve automatic retrieval
through successive repetitions. Edmond Huey in 1908 emphasized the
role of repetitions in the development of automaticity when he wrote,
“To perceive an entirely new word or other combination of strokes
requires considerable time, close attention and is likely to be imperfectly
done, just as when we attempt some new combination of movements,
some new trick in the gymnasium, or a new serve at tennis. In either case,
repetition progressively frees the mind from attention to details, makes
facile the total act, shortens the time, and reduces the extent to which
consciousness must concern itself with the process” (Huey, 1908, p. 104).
In the case of perceptual learning, repetitions would seem to provide
more than the consolidation of perceptions to the point where they can
be run off quite quickly and automatically. Another thing that can happen
during these repetitions is that the material can be reorganized into
higher-order units even before the lower-order units have achieved a
high level of automaticity. For example, when the child reads text in
which the same vocabulary is used over and over again, the repetitions
will certainly make more automatic the perceptions of each word unit,
but if he stays at the word level he will not realize his potential reading
speed. If, however, he begins to organize some of the words into short
groups or phrases as he reads, then further repetitions can strengthen
these units as well as word units. In this way he can break through the
upper limit of word-by-word reading and apply the benefits of further
repetitions to automatization of larger units. Apparently this sort of
higher-order chunking progresses as the child gains more experience in
reading. For example, Taylor et al. (1960) found that 1st grade children
made as many as two fixations per word whereas 12th graders made one
fixation for about every two words.
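The size of this difference is easier to see with a short worked calculation. The sketch below is illustrative only: the function name and the 100-word passage are assumptions, and the two rates are the approximate figures quoted above ("as many as two" and "about every two words"), not exact norms.

```python
def fixations_needed(n_words, fixations_per_word):
    """Number of fixations to read n_words at a given fixation rate."""
    return n_words * fixations_per_word


passage_words = 100  # hypothetical passage length

first_grade = fixations_needed(passage_words, 2.0)    # two fixations per word
twelfth_grade = fixations_needed(passage_words, 0.5)  # one fixation per two words

print(first_grade, twelfth_grade, first_grade / twelfth_grade)  # 200.0 50.0 4.0
```

On these approximate rates, the older readers cover the same text with about one quarter as many fixations, which is consistent with processing in larger units.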
Reorganization into larger units requires attention, according to the
model. We do not know specifically how to train a child to organize
codes into higher units although some speed-reading methods make
claims that sheer pressure for speed forces the person out of the word-
by-word reading into larger units. Nevertheless, we feel reasonably sure
that considerable application of attention is necessary if the reorganiz-
ation into higher-order units is to take place. When a person does not
pay attention to what he is practicing, he rules out opportunities for form-
ing higher units because he simply processes through codes that are
already laid down.
What may be critical in the determination of upper limits of word-group
units is the number of word meanings that the subject can comprehend
in one chunk in his semantic memory. Units at the semantic level
may determine chunk size at the phonological level which, in turn, may
influence how attention is distributed over visual codes. Stated more
generally, this hypothesis says that the limiting size of the chunk at early
levels is influenced by the existing chunk size at deeper levels. If this
hypothesis holds up under experimental test, it would imply that the
teaching of higher-order units for the reader should progress from deeper
levels to sensory levels, rather than the reverse.
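One way to read this hypothesis is sketched below. The text says only that chunk size at deeper levels "influences" the limit at earlier levels; treating that influence as a hard ceiling is an assumption made here for illustration, as are the function name and the chunk sizes (in words per chunk).

```python
def limiting_chunk_sizes(capacities_deep_to_early):
    """Cap each level's chunk size by the levels deeper than it.

    Hypothetical model: input is a list of (level, capacity) pairs ordered
    from the deepest level to the earliest (most sensory) level.
    """
    limits = {}
    ceiling = float("inf")
    for level, capacity in capacities_deep_to_early:
        # A level cannot exceed its own capacity or any deeper level's limit.
        ceiling = min(ceiling, capacity)
        limits[level] = ceiling
    return limits


# Made-up capacities, deepest level first.
example = [("semantic", 3), ("phonological", 5), ("visual", 4)]
print(limiting_chunk_sizes(example))
# {'semantic': 3, 'phonological': 3, 'visual': 3}
```

In this reading, raising the visual or phonological capacity alone changes nothing while the semantic chunk size stays at three, which is why the hypothesis would favor teaching from deeper levels outward.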
The model which has been presented here may have several helpful
features for the researcher concerned with reading. It provides explan-
atory power by clarifying a number of phenomena which have puzzled
SUPPES, P., GROEN, G., & SCHLAG-RAY, M. A model for response latency in paired-
associate learning. Journal of Mathematical Psychology, 1966, 3, 99-128.
TAYLOR, S. E., FRACKENPOHL, H., & PATTEE, J. L. Grade level norms for the com-
ponents of the fundamental reading skill. Bulletin #3, New York: Huntington,
Educational Development Laboratories, 1960.
TRABASSO, T. R., & BOWER, G. H. Attention in learning: Theory and research. New
York: Wiley, 1968.
TREISMAN, A. Selective attention in man. British Medical Bulletin, 1964, 20, 12-16.
TULVING, E. Subjective organization in free recall of “unrelated words.” Psychological
Review, 1962, 69, 344-354.
TULVING, E. Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.),
Organization of memory. New York: Academic Press, 1972.
WELFORD, A. T. Fundamentals of skill. London: Methuen, 1968.
WICKLUND, D. A., & KATZ, L. Short term retention and recognition of words by
children aged seven and ten. Visual Information Processing, Progress Report No.
2, University of Connecticut, 1970.