Unit II - Perception and Attention

Unit 2 - Perception

Perception is the ability to capture, process, and actively make sense of the
information that our senses receive. It is the cognitive process that makes it possible to
interpret our surroundings from the stimuli we receive through our sensory organs, allowing
people to take in information through their senses and then use this information to respond to
and interact with the world. The American Psychological Association (APA) defines perception
as "the process or result of becoming aware of objects, relationships, and events by means of
the senses, which includes such activities as recognizing, observing, and discriminating."
Perception includes the five senses: touch, sight, sound, smell, and taste. It also includes what
is known as proprioception, a set of senses that enable us to detect changes in body position
and movement. Many stimuli surround us at any given moment. Perception acts as a filter that
allows us to exist within and interpret the world without becoming overwhelmed by this
abundance of stimuli.
Interest in perception dates back to the ancient Greek philosophers, who were
interested in how people know the world and gain understanding. As psychology emerged as a
science separate from philosophy, researchers became interested in understanding how different
aspects of perception worked, particularly the perception of color. In addition to understanding
basic physiological processes, psychologists were also interested in understanding how the mind
interprets and organizes these perceptions. Gestalt psychologists proposed a holistic approach,
suggesting that the whole is greater than the sum of its parts. Cognitive psychologists have also
worked to understand how motivations and expectations play a role in the process of
perception. Researchers continue to investigate perception at the neural level. They also look at
how injury, medical conditions, and substances might affect perception.

Theories of perception
Top-down and bottom-up theories of perception
Psychologists often distinguish between top-down and bottom-up approaches to
information-processing. In top-down approaches, knowledge or expectations are used to guide
processing. Bottom-up approaches, however, are more like the structuralist approach, piecing
together data until a bigger picture is arrived at.
Bottom-up theories of perception
Bottom-up theories describe approaches in which perception starts with the stimuli, whose
appearance you take in through your eyes. You look out onto the cityscape, and perception
happens when the light information is transported to your brain. Therefore, these are data-driven
(i.e., stimulus-driven) theories. The four main bottom-up theories of form and pattern perception
are direct perception, template theories, feature theories, and recognition-by-components theory.
Theory of direct perception (Ecological view)
One of the strongest advocates of a bottom-up approach was J.J. Gibson (1904-1980),
who articulated a theory of direct perception. This stated that the real world provided sufficient
contextual information for our visual systems to directly perceive what was there, unmediated by
the influence of higher cognitive processes. Gibson developed the notion of affordances,
referring to those aspects of objects or environments that allow an individual to perform an
action. Gibson's emphasis on the match between individual and environment led him to refer to
his approach as ecological. Most psychologists now would argue that both bottom-up and top-
down processes are involved in perception.
Gestalt psychologists referred to this problem as the Höffding function (Köhler, 1940),
named after the 19th-century Danish psychologist Harald Höffding. He questioned whether
perception is such a simple process that all it takes is to associate what is seen with what is
remembered (associationism). An influential and controversial theorist who questioned
associationism was James J. Gibson.
According to Gibson's theory of direct perception, the information in our sensory
receptors, including the sensory context, is all we need to perceive anything. Because the
environment supplies us with all the information we need for perception, this view is sometimes
also called ecological perception. In other words, we do not need higher cognitive processes or
anything else to mediate between our sensory experiences and our perceptions; existing beliefs
or higher-level inferential thought processes are not necessary for perception. Consider, however,
the phrase "THE CAT", in which the H of "THE" is written identically to the A of "CAT", yet we
read each letter correctly.
Gibson believed that, in the real world, sufficient contextual information usually exists to
make perceptual judgments, so we need not appeal to higher-level intelligent processes to
explain perception. Gibson (1979) believed that we use this contextual information directly; in
essence, we are biologically tuned to respond to it. According to Gibson, we use texture
gradients as cues for depth and distance. These cues help us directly perceive the relative
proximity or distance of objects and of parts of objects.
Gibson's model is therefore sometimes referred to as an ecological model. This reference
is a result of Gibson's concern with perception as it occurs in the everyday world (the
ecological environment) rather than in laboratory situations, where less contextual information is
available. Direct perception may also play a role in interpersonal situations when we try to make
sense of others' emotions and intentions. After all, we recognize emotion in faces as such;
we do not see facial expressions that we then try to piece together into the perception of
an emotion.
Neuroscience also indicates that direct perception may be involved in person perception.
Mirror neurons are active both when a person acts and when he or she observes the same act
performed by somebody else. Furthermore, studies indicate that there are separate neural
pathways ('what' pathways) in the lateral occipital area for the processing of form, color, and
texture in objects.
Template Theories
Template theories suggest that we have stored in our minds myriad sets of templates.
Templates are highly detailed models for patterns we potentially might recognize. We recognize
a pattern by comparing it with our set of templates. We then choose the exact template that
perfectly matches what we observe. We see examples of template matching in our everyday
lives. Fingerprints are matched in this way. Machines rapidly process imprinted numerals on
checks by comparing them to templates. Increasingly, products of all kinds are identified with
universal product codes (UPCs or “bar codes”). They can be scanned and identified by
computers at the time of purchase. Chess players who have knowledge of many games use a
matching strategy in line with template theory to recall previous games. Template matching
theories belong to the group of chunk-based theories that suggest that expertise is attained by
acquiring chunks of knowledge in long-term memory that can later be accessed for fast
recognition. Studies with chess players have shown that the temporal lobe is indeed activated
when the players access the stored chunks in their long-term memory.
Template-matching theories fail to explain some aspects of the perception of letters. In
"THE CAT", we identify two different letters (A and H) from a single physical form. Höffding
(1891) noted other problems: we can recognize an A as an A despite variations in the size,
orientation, and form in which the letter is written.
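To make the idea concrete, here is a minimal sketch of exact template matching over binary pixel grids; the grids, letters, and matching rule are illustrative assumptions, not part of any published model. It also shows the failure mode noted above: a shifted pattern no longer matches its template.

```python
import numpy as np

# Hypothetical 3x3 binary "retinal images" used as stored templates.
TEMPLATES = {
    "L": np.array([[1, 0, 0],
                   [1, 0, 0],
                   [1, 1, 1]]),
    "T": np.array([[1, 1, 1],
                   [0, 1, 0],
                   [0, 1, 0]]),
}

def recognize(stimulus: np.ndarray) -> str | None:
    """Return the label whose template exactly matches the stimulus."""
    for label, template in TEMPLATES.items():
        if np.array_equal(stimulus, template):
            return label
    return None  # no template matches

# An exact "L" is recognized...
print(recognize(TEMPLATES["L"]))   # -> "L"

# ...but the same L shifted one column to the right fails,
# illustrating Höffding's objection about variations in position.
shifted_L = np.roll(TEMPLATES["L"], 1, axis=1)
print(recognize(shifted_L))        # -> None
```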
The Prototype Theory
Rosch (1973, 1975) proposed that rather than having a number of predefined
templates within our minds, we instead categorise percepts by referencing prototypes. Prototypes
are similar to templates in that they symbolise outlines or ideas of what an object should look
like; however, unlike templates, which require an exact match, prototypes rely on best guesses
when various features are in place.
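A minimal sketch of the difference, assuming purely illustrative feature vectors: instead of demanding an exact match, a prototype classifier assigns a percept to the category whose averaged exemplar it most resembles.

```python
import numpy as np

# Hypothetical feature vectors (e.g., [roundness, symmetry, size]) for
# previously seen exemplars of two made-up categories.
exemplars = {
    "cup":   np.array([[0.9, 0.6, 0.3], [0.8, 0.7, 0.4], [0.85, 0.65, 0.35]]),
    "plate": np.array([[1.0, 0.9, 0.7], [0.95, 0.95, 0.8], [0.9, 1.0, 0.75]]),
}

# A prototype is simply the average of a category's exemplars.
prototypes = {name: ex.mean(axis=0) for name, ex in exemplars.items()}

def categorize(percept: np.ndarray) -> str:
    """Best-guess categorisation: nearest prototype, no exact match needed."""
    return min(prototypes, key=lambda name: np.linalg.norm(percept - prototypes[name]))

# A never-before-seen stimulus still gets classified.
print(categorize(np.array([0.88, 0.62, 0.33])))  # -> "cup"
```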
Feature-Matching Theories
Yet another alternative explanation of pattern and form perception may be found in
feature-matching theories. According to these theories, we attempt to match features of a pattern
to features stored in memory, rather than to match a whole pattern to a template or a prototype.
The Pandemonium Model
One such feature-matching model has been called Pandemonium ("pandemonium" refers
to a very noisy, chaotic place, and is also a name for hell). In it, metaphorical "demons" with
specific duties receive and analyze the features of a stimulus.
In Oliver Selfridge's Pandemonium model, there are four kinds of demons: image
demons, feature demons, cognitive demons, and decision demons. In this model, the "image
demons" receive a retinal image and pass it on to the "feature demons." Each feature demon calls
out when there is a match between the stimulus and its given feature. These matches are yelled
out to demons at the next level of the hierarchy, the "cognitive (thinking) demons." The
cognitive demons in turn shout out possible patterns stored in memory that conform to one or
more of the features noticed by the feature demons. A "decision demon" listens to the
pandemonium of the cognitive demons and decides on what has been seen, based on which
cognitive demon is shouting most frequently (i.e., which has the most matching features).
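The shouting hierarchy maps naturally onto a tiny evidence-counting sketch; the feature inventory and letter definitions below are illustrative assumptions. Feature demons each detect one feature, cognitive demons count how many of their letter's features were shouted, and the decision demon picks the loudest.

```python
# Hypothetical feature inventory for three block letters.
LETTER_FEATURES = {
    "A": {"diagonal_left", "diagonal_right", "horizontal_bar"},
    "H": {"vertical_left", "vertical_right", "horizontal_bar"},
    "T": {"vertical_center", "horizontal_top"},
}

def pandemonium(stimulus_features: set[str]) -> str:
    # Feature demons: each shouts if its feature is present in the image.
    shouted = set(stimulus_features)
    # Cognitive demons: each counts how many of its letter's features were shouted.
    loudness = {letter: len(features & shouted)
                for letter, features in LETTER_FEATURES.items()}
    # Decision demon: picks whichever cognitive demon shouts most.
    return max(loudness, key=loudness.get)

# A degraded "H" missing one vertical stroke is still recognized.
print(pandemonium({"vertical_left", "horizontal_bar"}))  # -> "H"
```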
The Recognition by Components theory
Geons are part of a theory about how we recognize objects. Recognition-by-components
theory, developed by Biederman in 1987, incorporates structural description theory and holds
that all objects are made up of a set of 36 three-dimensional shapes. These shapes are called
geometric ions, or geons (or primitives). The idea that all objects are built out of geons is very
similar to the basic process of learning how to draw.
(I started drawing when I was really young. Like most kids, I started doodling as
soon as I was big enough to hold a crayon, but the hobby stuck with me and
developed over the years. I was self-taught for almost my entire life and only took
an actual art class when I entered high school. It was difficult at first to
unlearn the ways I was used to drawing and relearn some of the basics of
sketching. Some aspects didn't help improve my art at all, so I didn't use them
much. But the one important skill I learned, and have taken with me throughout the
rest of my life, was doing your initial sketching by using what are, essentially,
geons. Visually, everything, including human figures, is composed of basic two- and
three-dimensional shapes like squares, circles, triangles, and cylinders. Once you can
visualize how this works, it makes drawing much easier. Take a human figure: the
head is a circle, the shoulders and all the joints are circles, the arms and legs are
rectangles or cylinders, the torso is an upside-down triangle, the pelvic bone is an
upright triangle, and the feet and hands are ovals with thin rectangles protruding from
them. Although a theory about how we recognize objects is obviously different
from a skill used for drawing, the similarities made it easier for me to understand
recognition-by-components theory because, in a way, I'd been practicing a
rudimentary version of it for years.)
According to RBC theory, geons are recovered from nonaccidental properties. These are
properties of edges in an image (e.g., lines) that are reliably associated with properties of edges
in the world. To understand nonaccidental properties, consider seeing a box from many different
viewpoints. From most views, observers see three sides of the box, which terminate in a "Y"-
junction at a corner. This two-dimensional junction is an example of a nonaccidental property,
and it is associated with a three-dimensional corner.
Top-down theory of perception
Top-down processing is an important perceptual theory in cognitive psychology. The
theory establishes the paradigm that sensory information processing in human cognition, such as
perception, recognition, memory, and comprehension, is organized and shaped by our previous
experience and expectations, as well as by meaningful context (Solso, 1998).
Top-down processing suggests that we form our perceptions starting with a larger object,
concept, or idea before working our way toward more detailed information. In other words, top-
down processing happens when we work from the general to the specific, from the big picture to
the tiny details. In top-down processing, your abstract impressions can influence the sensory data
that you gather.
Top-down processing is also known as conceptually driven processing, since your
perceptions are influenced by expectations, existing beliefs, and cognitions. In some cases you
are aware of these influences, but in other instances this process occurs without conscious
awareness. Your brain applies what it knows and what it expects to perceive and fills in the
blanks, so to speak. For example, the same ambiguous character can be read as "B" in the
sequence "A B C" but as "13" in the sequence "12 13 14". Nearly everyone has had the
experience of rushing over to another person who appears to be an old friend, only to realize he
or she is actually a stranger; in such cases, our tendency to process information quickly from the
top down can indeed produce errors. Top-down processing helps in the identification and
recognition of stimuli.



Navon’s Study
David Navon, an Israeli psychologist, tried to tackle a central problem concerning the
course of perceptual processing. He asked: do we perceive a visual scene feature by feature, or
is the process instantaneous and simultaneous, as some Gestalt psychologists believed? To
examine this, Navon developed a classic paradigm involving the presentation of compound
stimuli: a large letter (the global level) composed of small letters (the local level). Navon's paper
about the speed with which people process global and local information is extremely popular
(Navon, 1977); it had been cited nearly 1,800 times as of early 2016. The basic idea of Navon's
study is that when objects are arranged in groups, there are global features and local features.
For example, a group of trees has local features (the individual trees) and a global feature (the
trees together forming a forest). The basic finding of Navon's work is that people are faster at
identifying features at the global level than at the local level. This effect is known as global
precedence. Consider a figure whose global feature is that it looks like an H, while its local
features are the many small letter X's the figure is made of: people are typically quicker to
detect the H than the X.
The global precedence effect is not observed only in this specific setup. For example,
words are generally recognized more quickly than their individual letters. The distance between
the screen and your eyes can also influence how easily the global features are detected; if you sit
very close to the screen, the local features become more obvious than the global features.
The Navon task (Navon, 1977) is a well-known letter identification task in which large
letters constructed from a number of much smaller letters are presented as stimuli; participants
respond to either the large or the small letters while ignoring the other type. Navon's research
demonstrated that global features are perceived more quickly than local features. Jules Davidoff
performed similar research in a remote culture and found the opposite result: those participants
more readily identified the local features.
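A sketch of how such a compound stimulus can be generated, assuming a simple 5x3 dot-matrix pattern for the global shape (the pattern itself is an illustrative assumption): each "pixel" of the large letter is drawn with copies of the small letter.

```python
# Minimal Navon-style compound stimulus: a global "H" built from local "X"s.
GLOBAL_H = [
    "1 0 1",
    "1 0 1",
    "1 1 1",
    "1 0 1",
    "1 0 1",
]

def navon_stimulus(pattern: list[str], local_letter: str) -> str:
    """Render a global shape whose 'on' cells are copies of the local letter."""
    rows = []
    for row in pattern:
        cells = row.split()
        rows.append("  ".join(local_letter if c == "1" else " " for c in cells))
    return "\n".join(rows)

print(navon_stimulus(GLOBAL_H, "X"))
# X     X
# X     X
# X  X  X
# X     X
# X     X
```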
Effect of Context
A context effect is an aspect of cognitive psychology that describes the influence of
environmental factors on one's perception of a stimulus. The impact of context effects is
considered part of top-down processing. The concept is supported by the theoretical approach to
perception known as constructive perception. Context effects can influence our daily lives in many
ways, and can have an extensive effect on marketing and consumer decisions. For example,
research has shown that the comfort level of the floor that shoppers are standing on while
reviewing products can affect their assessments of product quality: assessments are higher if the
floor is comfortable and lower if it is uncomfortable. "THE CAT" is a classic example of a
context effect: we have little trouble reading the "H" and the "A" in their appropriate contexts,
even though they take on the same form in each word. Context effects influence our daily lives
in many ways, such as word recognition, learning, and object recognition. Even the brightness of
a stimulus depends not only on its own luminance but on that of the surrounding stimulation.
Configural Superiority Effect
This is a form of context effect: the tendency for some complex visual stimuli, such as
faces or printed words, to be more easily recognizable than any of their constituent parts
presented in isolation. There are two kinds of superiority effect: the word superiority effect and
the object superiority effect.
When we examine actual speech, one word flows into another; there are no sharp
boundaries or pauses between words. (Think of trying to distinguish "ice cream" from
"I scream.") In fact, that is one of the reasons why someone speaking a language we don't
understand typically sounds as if he or she is speaking rapidly: we don't know how to separate
the speech stream in that language into words.
The word superiority effect is quite similar. In an early experiment, Reicher flashed
either a single letter or a whole word on a screen. The subject's task was to indicate whether a
specific letter was present. What Reicher found was that people were faster at identifying the
letter when it appeared in a word than when it appeared by itself (the word superiority effect, or
WSE). At first blush, this may seem a counterintuitive finding: a common-sense model tells you
that you first have to identify the letters before you can identify the word, so the reverse finding
might be expected.
There have been several explanations of the WSE over the years. One early explanation
is that a word provides constraints on guessing. So, if the word "NURSE" is flashed at you and
you are cued for what the second letter might have been, then even if you didn't actually see the
letter long enough to identify it, you might still guess "U", because there are only two
possibilities, and "nurse" is a more familiar word in English than "Norse." In any case, you
would not guess that the missing letter was a J. However, this doesn't really explain why people
miss some information when they see a single letter on screen, but seem to get enough
information when they see multiple letters forming a word.
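The constraint idea can be made concrete with a small sketch (the mini word list is an illustrative assumption): given a partially seen word, the lexicon sharply limits which letters could fill the unseen position.

```python
# Hypothetical mini-lexicon; a real model would use a full English word list.
LEXICON = {"NURSE", "NORSE", "HOUSE", "HORSE", "NOISE"}

def candidate_letters(partial: str) -> set[str]:
    """Letters that could fill the '_' slot, given the lexicon's constraints."""
    slot = partial.index("_")
    return {word[slot] for word in LEXICON
            if len(word) == len(partial)
            and all(p in ("_", w) for p, w in zip(partial, word))}

# Cued for the second letter of N_RSE: only two guesses remain.
print(candidate_letters("N_RSE"))  # -> {"U", "O"}
```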
In addition to the word superiority effect, by the way, people have also found an object
superiority effect by which component parts are recognized faster in objects than when they are
presented on their own.
In visual perception, configural superiority is the phenomenon in which a configuration of
elements or features is easier to identify than a single feature alone. When written stimuli are
degraded by noise or brief presentation, letters in words are reported more accurately than single
letters or letters embedded in non-words; for example, the letter A is reported more accurately in
the word CARD than in the non-word CRAD. The object superiority effect is the corresponding
finding in visual perception tasks: judgments about a briefly presented line are made more
efficiently when the line is part of a drawing of a three-dimensional object than when it is part
of a two-dimensional figure.
Integration
So, are perception and pattern recognition mainly top-down or bottom-up? The answer is
that we normally do both in combination. A very important place where this is seen is fluent
reading: good, rapid readers anticipate to some extent what the next words or ideas ought to be
(for example, "The majestic hawk swooped down and carried off the struggling _." is very
likely to suggest a limited range of final words to you, even though none has been presented.
"Rabbit," maybe?). But at the same time, relying too much on top-down processing can be a
disaster; you need to do some reality checking to make sure the expectations you have are
consistent with the data coming in through the senses. In short, we can moderate the relative
contributions of one or the other in different situations, but we generally do both. And once we
start to get some conception of a pattern through partial evidence from bottom-up processing,
top-down processes are then likely to activate and help speed up the pattern-recognition process.
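A toy sketch of this combination (the word list and context scores are illustrative assumptions): bottom-up evidence proposes candidates consistent with the letters seen so far, and a top-down context score re-ranks them.

```python
# Toy combination of bottom-up letter evidence with top-down context.
LEXICON = ["rabbit", "rabbi", "racket", "radio", "rocket"]
CONTEXT_SCORE = {"rabbit": 0.9, "rocket": 0.1}  # "...carried off the struggling _"

def recognize(partial: str) -> str:
    # Bottom-up: keep only words consistent with the letters seen so far.
    candidates = [w for w in LEXICON if w.startswith(partial)]
    # Top-down: prefer the candidate the sentence context makes most likely.
    return max(candidates, key=lambda w: CONTEXT_SCORE.get(w, 0.0))

print(recognize("ra"))  # -> "rabbit": context resolves the ambiguity early
```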
a. Depth Perception: One example of integration is depth perception. When you look at an
object, your brain integrates the different visual cues, such as binocular disparity (the difference
between the images seen by each eye) and monocular cues (e.g., relative size, perspective, and
motion parallax) to perceive the object's depth and distance.
b. Multisensory Integration: The integration approach also explains how our brain combines
information from different sensory modalities to form a unified perception of the world. For
example, when you watch a movie, your brain integrates visual and auditory cues to create a
cohesive and immersive experience.
c. Illusions: Certain visual illusions, such as the Müller-Lyer illusion, can be explained through
integration. In the Müller-Lyer illusion, two lines with arrowheads pointing in opposite
directions are perceived as different in length, even though they are the same. The brain
integrates the arrowheads' angles and the lines' overall configuration, leading to the
misperception of their lengths.
In summary, the top-down approach emphasizes the role of higher-level cognitive
processes and context in shaping perception, while the integration approach focuses on how the
brain processes and combines sensory information to create our perceptual experiences. Both
approaches provide valuable insights into the complexities of human perception.
Computational Theory
Marr used the term "computational theory" to describe this aspect of his approach to
visual perception. The term emphatically does not mean a theory that is just "something to do
with computers". Instead, it expresses the specific and very powerful idea that the first stage in
understanding perception is to identify the information that a perceiver needs from the world,
and the regular properties of the world that can be incorporated into processes for obtaining that
information. In other words, we need to know what computations a visual system needs to
perform before attempting to understand how it carries them out. Marr's applications of
computational theory focus on problems such as detecting the edges of surfaces, perceiving
depth, and recognizing objects. The approach has been widely influential.
One of the key issues addressed by Marr in Vision is the level of analysis that is used in
computational neuroscience. Marr favoured the top level, the computational theory level -
what is the goal of the computation; why is it appropriate; and what is the logic of the strategy by
which it can be carried out? He distinguished this from the second level, the representation
and algorithm - how can this computational theory be implemented? In particular, what is the
representation for the input and output, and what is the algorithm for the transformation? His
third level is hardware implementation - how can the representation and algorithm be realized
physically? In his earlier work on the cerebellar, neocortical and hippocampal theories, he had
included much on the third level, implementation in the brain, and this was being used to help
constrain the computational theory. But, perhaps partly for the reasons given above, in Vision he
strongly favoured the computational theory approach, suggesting that one should start here.
However, when understanding the cortical mechanisms of vision, what is found
neurophysiologically and in terms of the neuronal network architecture in the brain provides very
important constraints on the theory, whether this is of vision, memory, attention or decision
making. Thus a more modern approach, which is making very fast progress at present, is to
combine empirical neurophysiological and neuroanatomical data with approaches that produce
and test theories of how the brain computes. In turn, this strategy is informing a new approach to
neurological psychiatry that seeks to understand certain disorders of brain function (including
schizophrenia and obsessive-compulsive disorder) by analysing the stochastic dynamics and
stability of cortical systems; this again relies on combining theory with empirical research.
Marr was certainly right in the following: without theoretical approaches being part of how we
understand brain function, we will never understand how vision works, or for that matter
memory, attention, decision making and some neuropsychiatric disorders of cortical function.
Marr's Theory: From primal sketch to 3-D models
Marr proposed three different levels for the understanding of information-processing
systems (with vision systems as the target example): computational theory; representation
and algorithm; and hardware implementation. One of Marr's most important contributions
was made at the level of representation and algorithm, when he proposed a representational
framework for vision (Figure 1). He concentrated on the vision task of deriving shape
information from images.

Figure 1: Marr's representational framework

It is known that the intensities perceived by any visual system are a function of four main
factors: the geometry (meaning shape and relative placement); the reflectance of the visible
surfaces; the illumination; and the viewpoint. According to Marr's theory, the early visual system
derives representations in which these factors are separated. The first two representations in
Marr's framework, the primal sketch and the 2½-D sketch, are intended essentially to perform
that separation.
The detection of intensity changes, the representation and analysis of local geometric
structures, and the detection of illumination effects take place in the process of generating the
primal sketch. One important principle of the primal sketch is that the independent spatial
organization of the viewed intensities in a scene reflects the structure of the visible surfaces.
Marr proposed to capture these organizations by using a set of "place tokens", or low-level
features, corresponding to oriented edges, bars, ends and blobs, each represented by a
5-tuple: (type, position, orientation, scale, contrast). The 2½-D sketch is intended to represent
the orientation and depth of the visible surfaces as well as discontinuities. It is composed of
local surface-orientation primitives, distance from the viewer, and discontinuities in depth and in
surface orientation and, as in the previous representation, it is specified in a viewer-centered
coordinate system.
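The 5-tuple maps directly onto a small data structure. Here is a minimal sketch, with illustrative field types and values, of how primal-sketch place tokens might be represented.

```python
from dataclasses import dataclass
from enum import Enum

class TokenType(Enum):
    EDGE = "edge"
    BAR = "bar"
    END = "end"
    BLOB = "blob"

@dataclass
class PlaceToken:
    """One primal-sketch place token: (type, position, orientation, scale, contrast)."""
    type: TokenType
    position: tuple[float, float]  # image coordinates (x, y)
    orientation: float             # degrees
    scale: float                   # spatial extent of the feature
    contrast: float                # signed intensity change across the feature

# An oriented edge detected at (12.0, 40.5); values are illustrative only.
token = PlaceToken(TokenType.EDGE, (12.0, 40.5), orientation=45.0, scale=2.0, contrast=0.8)
print(token)
```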
The last representation in Marr's framework is the 3-D model representation. This
representation is intended to describe shapes and their organization using a modular and
hierarchical organization of volumetric and surface primitives (an example of the organization of
shape information in a 3-D model description can be seen in Figure 2). The recognition process
uses a catalogue of 3-D models, which is a collection of stored 3-D model descriptions together
with various indices into the collection that allow a new description to be associated with the
appropriate one in the collection.

All 3-D model descriptions can be organized in a hierarchy according to the specificity of
information they carry. The top level of such a hierarchy is a model which does not have a
component decomposition and describes the model's principal axis. At the next level in the
hierarchy more details are added to the model, like the number and distribution of subcomponent
axes along the principal axis. At the lower levels each individual object's model receives more
precise descriptions, and they can now be distinguished by the angles and length of their
components.
There are three kinds of indices in the model catalogue: the specificity index, the adjunct
index and the parent index. The specificity index supports the main recognition process, which
relates a newly derived 3-D model to a model in the catalogue. The process starts at the top of the
hierarchy and searches down the levels through models whose descriptions are consistent with
the new model's description, until the precision of information in the new model and in the
catalogue's model reaches the same level of specificity. The adjunct and parent indices play a
role secondary to that of the specificity index; their purpose is to provide contextual constraints
that support the derivation process. The adjunct (or subcomponent) index comes from adjunct
relations in a 3-D model and provides access to the 3-D models for its components, based on
their locations, orientations and relative sizes. The inverse of the adjunct index is the parent (or
supercomponent) index: whenever a component of a shape is identified, it can provide
information about what the whole shape is likely to be.
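A minimal sketch of the catalogue's hierarchical search via the specificity index; the model classes, fields, and the quadruped example are illustrative assumptions, with axis count standing in for the "precision of information" in a description.

```python
from dataclasses import dataclass, field

@dataclass
class Model3D:
    """A 3-D model: a principal axis plus an optional component decomposition."""
    name: str
    axis_count: int  # crude stand-in for the specificity of the description
    children: list["Model3D"] = field(default_factory=list)  # specificity index

# Tiny illustrative catalogue: generic shape -> quadruped -> individual animals.
catalogue = Model3D("cylinder", 1, [
    Model3D("quadruped", 5, [
        Model3D("horse", 9),
        Model3D("cow", 9),
    ]),
])

def recognize(root: Model3D, new_axes: int) -> Model3D:
    """Walk down the hierarchy until the stored model matches the
    specificity (here, axis count) of the newly derived description."""
    node = root
    while node.children and node.children[0].axis_count <= new_axes:
        # Descend to a consistent child (the consistency check is simplified).
        node = node.children[0]
    return node

# A new description with 5 component axes stops at the quadruped level.
print(recognize(catalogue, new_axes=5).name)  # -> "quadruped"
```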

Figure 2: An example of Marr's 3-D model description. Each box corresponds to a 3-D
model; the left side of a box contains the model axis and the right side contains the decomposition of
the model's component axes. For illustration purposes only, the relative position and orientation of a
model's component axes are represented here in a viewer-centered coordinate system, rather
than in the correct object-centered one.

Attention
The concept of 'attention' is an area of study within cognitive psychology. Attention
refers to one's ability to select and focus on relevant stimuli. In other words, it is how we actively
process information in our environment and tune out information, perceptions and sensations that
aren't relevant at the moment. For example, many people like to work in their favorite coffee
spots. Although a public place offers many distractions, such as the crowd, the staff and
the bustling noise of the traffic, people remain focused on their work; their attention allows
them to concentrate on the things that are important to them. As you can see, attention can help
us focus as well as ignore information around us. In order to understand how attention works, let
us look at its key aspects:
Limited:
Attention is limited both in terms of capacity and duration
Selective:
Since attention is limited, we need to be selective about what and where to focus
Cognitive:
Attention is part of our cognitive system and aids in our ability to survive
Types of Attention
Attention is a dynamic phenomenon that changes according to the immediate
environment. It is a complex process rooted in various cognitive functions. Over the years,
researchers have identified various types of attention, and understanding the different types is
key to being more efficient. Before we dive deeper into the different types of attention, let's
look at the factors that influence them.
Internal factors:
These depend on your brain functions and cognitive resources, such as emotions, mindset
and interests.
External factors:
These depend on the characteristics of the stimuli in your surroundings. There are
several types of attention that we use during our daily activities.
Selective Attention
Every day we're exposed to various stimuli, and selective attention helps us navigate
complex settings: you select from various stimuli and focus on what you find important. Take
the workplace, for example. You are surrounded by coworkers and electronics, which can act as
distractions. You use selective attention to focus on your work and keep the noise at bay. It's
safe to say that if you're good at selective attention, you're good at ignoring distractions and
concentrating on your priorities.
Sustained Attention
This is the ability to focus on something for long periods of time without being distracted.
In other words, you concentrate on time-consuming tasks by using sustained attention. There are
three stages of sustained attention:
 Paying Attention, When You Start To Focus
 Keeping Attention, When You Continue To Focus
 Ending Attention, When You Finally Stop Paying Attention
Students often employ sustained attention to study for examinations. You've probably used
sustained attention for activities such as attending business meetings or conferences or preparing
business decks.
Divided Attention
When you focus on two or more things at the same time, you're using divided attention:
you're essentially dividing your attention between two or more tasks. This ability is also known
as multitasking. Divided attention spreads focus widely, not allowing us to focus fully on any
one task. For example, you may have written an email while attending a webinar.
Divided attention doesn't last long, because you split your attention between various tasks
and perform them at the same time. Multitasking affects your productivity in the long run, so
you should divide your attention only when it is absolutely necessary.
Alternating Attention
Similar to divided attention, alternating attention involves shifting your focus and
switching between multiple tasks. However, unlike divided attention, you're not performing
multiple activities at the same time; even as you switch your attention among various tasks,
you remain focused on the task at hand.
We use alternating attention more often than we realize. For example, you switch your
focus between taking notes and making sense of those notes during a meeting or presentation.
Selective Models of Attention
Selective attention is the process of directing our awareness to relevant stimuli while
ignoring irrelevant stimuli in the environment. This is an important process, as there is a limit to
how much information can be processed at a given time, and selective attention allows us to tune
out insignificant details and focus on what is important. This limited capacity for paying
attention has been conceptualized as a bottleneck, which restricts the flow of information: the
narrower the bottleneck, the lower the rate of flow.

Broadbent's and Treisman's models of attention are both bottleneck models, because they
predict that we cannot consciously attend to all of our sensory input at the same time.
Broadbent's Filter Model
Broadbent (1958) proposed that physical characteristics of messages are used to select
one message for further processing and that all others are lost. Information from all of the stimuli
presented at any given time enters an unlimited capacity sensory buffer. One of the inputs is then
selected on the basis of its physical characteristics for further processing by being allowed to
pass through a filter.

Because we have only a limited capacity to process information, this filter is designed to
prevent the information-processing system from becoming overloaded. The inputs not initially
selected by the filter remain briefly in the sensory buffer store, and if they are not processed they
decay rapidly. Broadbent assumed that the filter rejected the unattended message at an early
stage of processing.

According to Broadbent, the meaning of the messages is not taken into account at
all by the filter; all semantic processing is carried out after the filter has selected the message to
pay attention to. So whatever messages are restricted by the bottleneck (i.e., not selected) are not
understood. Broadbent wanted to see how people were able to focus their attention (selectively
attend), and to do this he deliberately overloaded them with stimuli. One of the ways Broadbent
achieved this was by simultaneously sending one message to a person's right ear and a different
message to their left ear. This is called a split-span experiment (also known as the dichotic
listening task).
Dichotic Listening Task
The dichotic listening task involves simultaneously sending one message (a 3-digit
number) to a person's right ear and a different message (a different 3-digit number) to their left
ear.

Participants were asked to listen to both messages at the same time and repeat what they
heard. Broadbent was interested in how these would be repeated back: would the participant
repeat the digits in the order in which they were heard (order of presentation), or repeat back
what was heard in one ear followed by the other ear (ear by ear)? He found that people made
fewer mistakes repeating back ear by ear, and would usually repeat back this way.
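A minimal sketch of the early-selection idea, with illustrative inputs: everything enters a sensory buffer, the filter selects one channel by a physical characteristic (here, which ear), and only that channel goes on to semantic processing.

```python
# Early-selection (Broadbent) sketch; the ear inputs are illustrative.
sensory_buffer = {
    "left_ear":  ["7", "3", "5"],
    "right_ear": ["2", "9", "1"],
}

def broadbent_filter(buffer: dict[str, list[str]], attended: str) -> list[str]:
    """Select one channel on a physical basis; only it gets semantic processing."""
    selected = buffer[attended]  # passes the filter for further processing
    # Unattended channels stay briefly in the buffer and then decay:
    # in this model they are never analyzed for meaning.
    return selected

# Ear-by-ear report: all of one channel, then (if re-attended) the other.
print(broadbent_filter(sensory_buffer, "left_ear"))  # -> ['7', '3', '5']
```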
Evaluation of Broadbent's Model
1. Broadbent's dichotic listening experiments have been criticized because:

 The early studies all used people who were unfamiliar with shadowing and so found it
very difficult and demanding. Eysenck and Keane (1990) claim that the inability of naive
participants to shadow successfully is due to their unfamiliarity with the shadowing task
rather than to an inability of the attentional system.
 Participants reported after the entire message had been played - it is possible that the
unattended message is analyzed thoroughly but participants forget.
 Analysis of the unattended message might occur below the level of conscious
awareness. For example, research by Von Wright et al (1975) indicated analysis of the
unattended message in a shadowing task. A word was first presented to participants with
a mild electric shock. When the same word was later presented to the unattended
channel, participants registered an increase in GSR (indicative of emotional arousal and
analysis of the word in the unattended channel).
 More recent research has indicated that the above points are important. For example,
Moray (1959) studied the effects of practice: naive subjects could detect only 8% of digits
appearing in either the shadowed or non-shadowed message, whereas Moray (an experienced
'shadower') detected 67%.
2. Broadbent's theory predicts that hearing your name when you are not paying attention should
be impossible because unattended messages are filtered out before you process the meaning -
thus the model cannot account for the 'Cocktail Party Phenomenon'.
3. Other researchers have demonstrated the 'cocktail party effect' (Cherry, 1953) under
experimental conditions and have discovered occasions when information heard in the
unattended ear 'broke through' to interfere with information participants are paying attention to in
the other ear.
This implies some analysis of the meaning of stimuli must have occurred prior to the
selection of channels. In Broadbent's model, the filter is based solely on sensory analysis of the
physical characteristics of the stimuli.
Treisman's Attenuation Model
Treisman (1964) agrees with Broadbent's theory of an early bottleneck filter. However,
the difference is that Treisman's filter attenuates rather than eliminates the unattended material.
Attenuation is like turning down the volume so that if you have 4 sources of sound in one room
(TV, radio, people talking, baby crying) you can turn down or attenuate 3 in order to attend to
the fourth. This means that people can still process the meaning of the attended message(s).

In her experiments, Treisman demonstrated that participants were still able to identify the
contents of an unattended message, indicating that they were able to process the meaning of both
the attended and unattended messages. Treisman carried out dichotic listening tasks using the
speech-shadowing method. Typically, in this method participants are asked to repeat aloud
speech played into one ear (called the attended ear) while another message is played to the
other ear. For example, participants asked to shadow "I saw the girl furniture over" and ignore
"me that bird green jumping fee" reported hearing "I saw the girl jumping over".
Clearly, then, the unattended message was being processed for meaning, and Broadbent's filter
model, in which the filter selects on the basis of physical characteristics only, could not explain
these findings. The evidence suggests that Broadbent's filter model is not adequate; it does not
allow for meaning to be taken into account.
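A minimal sketch of attenuation, with illustrative gains and thresholds: unattended channels are turned down rather than blocked, and recognition units with low thresholds (your own name, contextually expected words) can still fire on the attenuated signal.

```python
# Attenuation (Treisman) sketch; all numbers are illustrative assumptions.
ATTENDED_GAIN, UNATTENDED_GAIN = 1.0, 0.3  # attenuated, not eliminated

# Recognition thresholds: important or contextually expected words
# need less signal strength to be recognized.
THRESHOLDS = {
    "your_name": 0.2,   # permanently low threshold
    "jumping": 0.25,    # lowered here by context from the attended sentence
    "furniture": 0.8,   # unexpected word, high threshold
}

def recognized(word: str, attended: bool) -> bool:
    signal = ATTENDED_GAIN if attended else UNATTENDED_GAIN
    return signal >= THRESHOLDS.get(word, 0.6)

print(recognized("furniture", attended=False))  # False: attenuated below threshold
print(recognized("your_name", attended=False))  # True: low-threshold unit still fires
print(recognized("jumping", attended=False))    # True: context lowered its threshold
```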
Evaluation of Treisman's Model
1. Treisman's Model overcomes some of the problems associated with Broadbent's Filter Model,
e.g. the Attenuation Model can account for the 'Cocktail Party Syndrome'.
2. Treisman's model does not explain how exactly semantic analysis works.
3. The nature of the attenuation process has never been precisely specified.
4. A problem with all dichotic listening experiments is that you can never be sure that the
participants have not actually switched attention to the so-called unattended channel.
Capacity model of attention
Daniel Kahneman took a different approach to describing attention, focusing on its
division rather than on selection mechanisms. He describes attention as a resource for which
energy or mental effort is required. Mental effort is used while performing any mental task, and
the greater the task's complexity, the greater the effort needed to solve it. Kahneman proposes
three basic conditions that need to be met for the proper completion of a task: by
combining total attentional capacity, momentary mental effort, and an appropriate allocation
policy for the attentional capacity, a person will exert enough mental effort to overcome mental
tasks. The key component is allocating enough attention, as a resource, to the task at hand.
Kahneman also noted that arousal influences the total attentional capacity in any given situation.
In addition, his model incorporates the ideas of voluntary and reflexive attention, which affect
the allocation policy. In order to direct attention appropriately, one must attend to relevant
information while neglecting irrelevant information, to avoid becoming distracted. This mental-
effort theory provides an overview of the influences on and interdependencies of attention
allocation, and is meant to supplement attention selection models. Kahneman identifies his
theory as a capacity theory of attention, meaning that (1) attention is not an unlimited resource
and (2) attention is a shared resource.
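A toy sketch of the capacity idea; the numbers, the arousal function, and the pass/fail rule are illustrative assumptions. Available capacity varies with arousal, an allocation policy splits it across tasks, and a task fails when its share falls below the effort it demands.

```python
# Capacity-model (Kahneman) sketch; all numbers are illustrative.
def total_capacity(arousal: float) -> float:
    """Capacity grows with arousal up to a ceiling (a crude stand-in)."""
    return 10.0 * min(arousal, 1.0)

def allocate(tasks: dict[str, float], arousal: float,
             policy: dict[str, float]) -> dict[str, bool]:
    """Split available capacity by the allocation policy; a task succeeds
    only if its share covers the mental effort it demands."""
    capacity = total_capacity(arousal)
    return {name: capacity * policy[name] >= demand
            for name, demand in tasks.items()}

tasks  = {"drive": 6.0, "converse": 3.0}  # demanded mental effort
policy = {"drive": 0.7, "converse": 0.3}  # share of capacity per task
print(allocate(tasks, arousal=1.0, policy=policy))  # both succeed
print(allocate(tasks, arousal=0.5, policy=policy))  # capacity halved: both fail
```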
Wickens' Multiple Resource Model
Wickens' multiple resource theory suggests that several different 'cognitive resources'
can be used simultaneously. Cognitive resources are represented by boxes in Figure 1.
Figure 1: Wickens' multiple resource theory (MRT) model

In basic terms the theory proposes that when different tasks require the same cognitive
resource, e.g. visual perception, information must be processed in sequence. When the task
requires different resources, e.g. visual perception and auditory perception, they can be processed
simultaneously.

The 4-dimensional model
There are four dimensions in the model (as shown in Figure 1): stages, processing codes,
input modalities and visual channels.
1. Stages
In the MRT, dual-task interference depends on whether multiple tasks require
cognitive/perceptual activities or response activities. These activities represent the stages of
the model and are treated as a dichotomy, in the sense that dual tasks requiring the same stage
are prone to greater interference than dual tasks in which one task requires a cognitive activity
and the other a response.
Figure 2. Representation of two resources, supplying the different stages of information
processing. Sensory processing, the operation of the peripheral visual and auditory
systems, is relatively resource-free (automatic).

In Figure 2, perceptual and cognitive activities share the same resources and are
functionally separate from the processes used to select and execute a response. Evidence for the
stages comes from Shallice et al.'s 1985 study on dual-task performance, which found that
speech and motor activity (responses) are often controlled in the frontal regions of the brain,
anterior to the central sulcus, while perception and language comprehension tend to be
undertaken in regions posterior to the central sulcus. This indicates that the stage dichotomy
can be associated with different brain structures.
Wickens (2002) uses a concrete example to represent this: "the added requirement for an
air traffic controller to acknowledge vocally or manually each change in aircraft state (a response
demand) would not disrupt his or her ability to maintain an accurate mental picture of the
airspace (a perceptual-cognitive demand)."
Wickens also postulates that interference is likely to be great between resource-demanding
perceptual tasks and cognitive tasks involving working memory to store or transform
information: despite using different information processes, they are supported by common
resources. An example of this can be seen in dual tasks such as visually searching while
mentally rotating an object, or understanding speech while rehearsing a speech.
2. Processing codes
Processing codes refer to separate resources used in analogue/spatial processes and
categorical/symbolic processes (verbal). The model further postulates that these resources are
separate and distinct across the 3 stages of perception, cognition and responses.
Wickens postulates that this separation of resources may account for the lack of
interference that can occur when manual and vocal responses are time-shared. In the model,
manual responses are seen as spatial in nature (e.g. tracking or steering) while vocal responses
are verbal (e.g. speaking). This allows the model to predict when it may or may not be useful to
employ voice versus manual control.
Wickens and Liu (1988) found that manual control may disrupt performance in a task
environment that imposes demands on spatial working memory (e.g. dialling a phone number
while steering a car), whereas voice control may disrupt performance of a task with heavy verbal
demands (or be disrupted by it, depending on the resource allocation).
3. Input (modalities)
In the MRT, Wickens describes the perceptual modalities, visual (V), auditory (A),
tactile and olfactory, used in time-sharing tasks. In particular, there is a dichotomy
between tasks that utilise separate modalities and tasks that share one modality; these are
referred to as cross-modal time-sharing (e.g. AV) and intra-modal time-sharing (e.g. AA or VV).
The model predicts that there will generally be less interference when using cross-modal
rather than intra-modal combinations, as a result of separate perceptual resources being used at
the same time. Wickens, however, is uncertain whether this is really the case, and points out
that the advantage of cross-modal time-sharing may instead result from peripheral factors in
intra-modal conditions causing confusion or masking. For example, tasks that require two
competing visual channels, if far apart, require visual scanning between them; if the tasks use
visual channels that are closer to one another, they may cause confusion and masking. The same
is true of dual tasks that require listening to two messages simultaneously.
Research has found that non-resource factors, such as the process of attention (i.e.
knowing what to look for in two tasks) or the 'pre-emptive' characteristics of auditory
information, may contribute to the cross-modal advantage (Wickens and Liu 1988). Regardless,
it can be inferred from the MRT that dual-task interference can generally be reduced by having
tasks that use one visual modality and one auditory modality. However, there are exceptions in
cases where two visual displays may be more practical than one visual display and one auditory
display.
4. Visual channels
The MRT proposes that there are two visual channels used in visual processing: focal
and ambient. These are said to use separate resources, which are characterised by a) the location
within the brain where processing occurs and b) the type of processing that is undertaken. The
model infers that dual tasks involving one focal and one ambient process will result in little
interference.
Focal vision is linked to eye movement and is used for fine details and pattern recognition.
It is used in tasks involving visual search, object recognition and other tasks requiring high visual
acuity (e.g. reading text).
Ambient vision involves the use of peripheral vision and is used for sensing one's
orientation and motion in the environment. Examples of dual tasks that use both channels
include walking down a corridor (ambient) while reading a book (focal), or keeping a car in the
centre of a lane (ambient) while reading a road sign or looking at the rear-view mirror (focal).
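The four dimensions lend themselves to a simple interference sketch. The task encodings below and the rule "more shared dimensions, more interference" are illustrative simplifications of Wickens' conflict-matrix idea, not his actual computational model.

```python
# Simplified multiple-resource interference sketch; task encodings are illustrative.
# Each task is described on the model's four dimensions.
TASKS = {
    "read_road_sign": {"stage": "perceptual", "code": "verbal",
                       "modality": "visual", "channel": "focal"},
    "keep_lane":      {"stage": "response", "code": "spatial",
                       "modality": "visual", "channel": "ambient"},
    "listen_to_gps":  {"stage": "perceptual", "code": "verbal",
                       "modality": "auditory", "channel": None},
}

def interference(task_a: str, task_b: str) -> int:
    """Count shared resource dimensions: higher means worse time-sharing."""
    a, b = TASKS[task_a], TASKS[task_b]
    return sum(1 for dim in a if a[dim] is not None and a[dim] == b[dim])

# Focal reading vs. ambient lane-keeping share only the visual modality...
print(interference("read_road_sign", "keep_lane"))      # -> 1
# ...while reading and listening to speech share stage and verbal code.
print(interference("read_road_sign", "listen_to_gps"))  # -> 2
```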
Practical application
The theory allows us to predict behaviour in multitasking activities, e.g. reading a map
and talking on the phone while driving. It is particularly useful for predicting dual-task
interference compared with earlier cognitive 'filtering' models, e.g. Broadbent (1958).
Theoretical application
MRT is closely related to both attention and workload; the “multiple” aspect relates to
attention and the “resource” aspect relates to workload.

********************************************
