
Learning and Behavior

A Contemporary Synthesis
Second Edition

Mark E. Bouton
University of Vermont

Sinauer Associates, Inc. • Publishers


Sunderland, Massachusetts 01375
The Cover
Composition No. II; Composition in Line and Color
Piet Mondrian
1913
Oil on canvas
Kröller-Müller Museum, Otterlo
© World History Archive/Alamy Stock Photo.

Learning and Behavior: A Contemporary Synthesis, Second Edition
Copyright © 2016 by Sinauer Associates, Inc. All rights reserved. This
book may not be reproduced in whole or in part without the
permission of the publisher. For information address:

Sinauer Associates, Inc.


23 Plumtree Road
Sunderland, MA 01375
U.S.A.

FAX: 413-549-4300
[email protected]; [email protected]

Library of Congress Cataloging-in-Publication Data

Names: Bouton, Mark E., author.


Title: Learning and behavior: a contemporary synthesis / Mark E.
Bouton,
   University of Vermont.
Description: Second edition. | Sunderland, Massachusetts: Sinauer
   Associates, Inc., [2016] | Includes bibliographical references and index.
Identifiers: LCCN 2016003962 | ISBN 9780878933853 (pbk.)
Subjects: LCSH: Learning, Psychology of. | Cognition. | Conditioned
response.
Classification: LCC BF318 .B675 2016 | DDC 153.1/5--dc23
LC record available from the Library of Congress.

Printed in U.S.A.
6 5 4 3 2 1
For Suzy, Lindsay, and Grace
And for my mother and father
Table of Contents

Preface xiii

Chapter 1
Learning Theory: What It Is and How It Got This Way  3
Philosophical Roots  5
  Are people machines?  5
  Associations and the contents of the mind  7
Biological Roots  9
  Reflexes, evolution, and early comparative psychology  9
  The rise of the conditioning experiment  12
A Science of Learning and Behavior  14
  John B. Watson  14
  B. F. Skinner  17
  Edward C. Tolman  20
  Computer and brain metaphors  22
  Human learning and animal learning  25
Tools for Analyzing Learning and Behavior  27
  Learning about stimuli and about behavior  28
  Crows foraging at the beach  29
  Human eating and overeating  31
  Kids at play  31
  People using drugs  33
  Relations between S, R, and O  33
Summary  35
Discussion Questions  37
Key People and Key Terms  38

Chapter 2
Learning and Adaptation  41
Evolution and Behavior  42
  Natural selection  42
  Adaptation in behavior  42
  Fixed action patterns  44
  Innate behavior  45
  Habituation  47
Adaptation and Learning: Instrumental Conditioning  50
  The law of effect  51
  Reinforcement  52
  Shaping  52
Adaptation and Learning: Classical Conditioning  54
  Signals for food  55
  Territoriality and reproduction  56
  Fear  59
  Conditioning with drugs as the outcome  60
  Sign tracking  63
Other Parallels Between Signal and Response Learning  64
  Extinction  64
  Timing of the outcome  66
  Size of the outcome  69
  Preparedness  70
Summary  74
Discussion Questions  75
Key Terms  76

Chapter 3
The Nuts and Bolts of Classical Conditioning  79
The Basic Conditioning Experiment  80
  Pavlov's experiment  80
  What is learned in conditioning?  81
  Variations on the basic experiment  83
Methods for Studying Classical Conditioning  84
  Eyeblink conditioning in rabbits  85
  Fear conditioning in rats  86
  Autoshaping in pigeons  87
  Appetitive conditioning in rats  89
  Taste aversion learning  90
Things That Affect the Strength of Conditioning  90
  Time  91
  Novelty of the CS and the US  93
  Intensity of the CS and the US  94
  Pseudoconditioning and sensitization  95
Conditioned Inhibition  97
  How to produce conditioned inhibition  97
  How to detect conditioned inhibition  98
  Two methods that do NOT produce true inhibition  100
Information Value in Conditioning  101
  CS-US contingencies in classical conditioning  101
  Blocking and unblocking  103
  Overshadowing  106
  Relative validity in conditioning  106
Summary  109
Discussion Questions  110
Key Terms  111

Chapter 4
Theories of Conditioning  113
The Rescorla-Wagner Model  114
  Blocking and unblocking  117
  Extinction and inhibition  119
  Other new predictions  122
  CS-US contingencies  125
  What does it all mean?  127
Some Problems with the Rescorla-Wagner Model  128
  The extinction of inhibition  128
  Latent inhibition  128
  Another look at blocking  129
The Role of Attention in Conditioning  130
  The Mackintosh model  130
  The Pearce-Hall model  132
  A combined approach  134
  What does it all mean?  135
Short-Term Memory and Learning  136
  Priming of the US  138
  Priming of the CS  138
  Habituation  141
  What does it all mean?  142
Nodes, Connections, and Conditioning  143
  Wagner's "SOP" model  144
  Sensory versus emotional US nodes  148
  Elemental versus configural CS nodes  150
  What does it all mean?  153
Summary  154
Discussion Questions  156
Key Terms  157

Chapter 5
Whatever Happened to Behavior Anyway?  159
Memory and Learning  160
  How well is conditioning remembered?  160
  Causes of forgetting  163
  Remembering, forgetting, and extinction  166
  Other examples of context, ambiguity, and interference  171
  Can memories be erased?  173
Interim Summary  177
The Modulation of Behavior  177
  Occasion setting  178
  Three properties of occasion setters  181
  What does it all mean?  183
  What is learned in occasion setting?  184
  Configural conditioning  186
  Other forms of modulation  186
  What does it all mean?  187
Understanding the Nature of the Conditioned Response  187
  Two problems for stimulus substitution  188
  Understanding conditioned compensatory responses  190
  Conditioning and behavior systems  193
  What does it all mean?  197
Conclusion  199
Summary  200
Discussion Questions  201
Key Terms  202

Chapter 6
Are the Laws of Conditioning General?  205
Everything You Know Is Wrong  206
Special Characteristics of Flavor Aversion Learning  208
  One-trial learning  208
  Long-delay learning  209
  Learned safety  211
  Hedonic shift  213
  Compound potentiation  216
  Conclusion  220
Some Reasons Learning Laws May Be General  220
  Evolution produces both generality and specificity  220
  The generality of relative validity  222
Associative Learning in Honeybees and Humans  225
  Conditioning in bees  225
  Category and causal learning in humans  228
  Some disconnections between conditioning and human category and causal learning  233
  Causes, effects, and causal power  237
  Conclusion  241
Summary  242
Discussion Questions  243
Key Terms  243

Chapter 7
Behavior and Its Consequences  245
Basic Tools and Issues  246
  Reinforcement versus contiguity theory  246
  Flexibility, purpose, and motivation  249
  Operant psychology  252
  Conditioned reinforcement  254
The Relationship Between Behavior and Payoff  257
  Different ways to schedule payoff  257
  Choice  260
  Choice is everywhere  264
  Impulsiveness and self-control  266
  Nudging better choices  271
  Behavioral economics: Are reinforcers all alike?  272
Theories of Reinforcement  276
  Drive reduction  276
  The Premack principle  277
  Problems with the Premack principle  280
  Behavioral regulation theory  282
  Selection by consequences  284
Summary  288
Discussion Questions  289
Key Terms  291

Chapter 8
How Stimuli Guide Instrumental Action  293
Categorization and Discrimination  295
  Trees, water, and Margaret  296
  Other categories  298
  How do they do it?  301
Basic Processes of Generalization and Discrimination  305
  The generalization gradient  306
  Interactions between gradients  309
  Perceptual learning  313
  Mediated generalization and acquired equivalence  317
  Conclusion  320
Another Look at the Information Processing System  320
  Visual perception in pigeons  321
  Attention  325
  Working memory  326
  Reference memory  332
The Cognition of Time  335
  Time of day cues  335
  Interval timing  336
  How do they do it?  340
The Cognition of Space  343
  Cues that guide spatial behavior  343
  Spatial learning in the radial maze and water maze  346
  How do they do it?  349
Metacognition  355
  How do they do it?  358
Summary  359
Discussion Questions  360
Key Terms  361

Chapter 9
The Motivation of Instrumental Action  363
How Motivational States Affect Behavior  364
  Motivation versus learning  364
  Does Drive merely energize?  366
  Is motivated behavior a response to need?  371
Anticipating Reward and Punishment  376
  Bait and switch  376
  The Hullian response: Incentive motivation  379
  Frustration  380
  Another paradoxical reward effect  382
  Partial reinforcement and persistence  384
  Motivation by expectancies  387
  General and specific outcome expectancies  391
  What does it all mean?  394
Dynamic Effects of Motivating Stimuli  396
  Opponent-process theory  396
  Emotions in social attachment  399
  A further look at addiction  401
  Conclusion  404
Summary  405
Discussion Questions  407
Key Terms  408

Chapter 10
A Synthetic Perspective on Instrumental Action  411
Avoidance Learning  412
  The puzzle and solution: Two-factor theory  412
  Problems with two-factor theory  415
  Species-specific defense reactions  420
  Cognitive factors in avoidance learning  426
  Learned helplessness  431
  Summary: What does it all mean?  436
Parallels in Appetitive Learning  436
  The misbehavior of organisms  436
  Superstition revisited  437
  A general role for stimulus learning in response learning situations  440
  Punishment  442
  Summary: What does it all mean?  445
A Cognitive Analysis of Instrumental Action  445
  Knowledge of the R-O relation  446
  Knowledge of the S-O relation  452
  S-(R-O) learning (occasion setting)  454
  S-R and "habit" learning  456
Summary  461
Discussion Questions  463
Key Terms  464

Glossary 465
References 481
Author Index  529
Subject Index  539
Preface
The Second Edition of this book has been thoroughly updated, but retains
the outline and structure of the First Edition. After the first three chapters
introduce the history of the field of Learning Theory and its basic findings
and concepts (within a functional framework), the remaining chapters pro-
vide what I hope are interesting story lines that keep the reader engaged
and explain the intellectual context of developments in the field. Chapter
4 covers the major theories of classical conditioning beginning with the
Rescorla-Wagner model. I find that students feel especially rewarded when
they master this material, and I believe that those who haven’t been ex-
posed to it may be at a disadvantage if they want to apply knowledge in
the field to other parts of psychology or the world at large. Chapter 5 then
explores how learning gets translated back into behavior. It covers memory
retrieval, extinction, reconsolidation, occasion setting, and behavior sys-
tems, along with other topics. Chapter 6 considers the challenge (created
by the discovery of taste aversion learning) that the principles and theories
of learning developed in the learning lab might not generalize very widely.
Along the way, we get a chance to think more specifically about associa-
tive learning in honeybees and humans. Throughout, the book focuses on
ideas, their interconnectedness, and their evaluation and improvement
through empirical research.
The last four chapters turn more specifically to understanding voluntary
behavior. After considering the classic ideas of Thorndike, Guthrie, and Tol-
man, Chapter 7 discusses material that will be sought by instructors with an
interest in behavior analysis; it covers key topics in operant learning as well
as modern perspectives on choice, reinforcement, delay discounting, and
behavioral economics. The idea is again to show how the research builds
and interconnects. Chapter 8, on stimulus control and animal cognition,
begins with a discussion of categorization in pigeons, which justifies a look
at more foundational research on generalization and discrimination. It then
proceeds to cover topics in perception, attention, and memory in animals,
the cognition of time and space, and finally metacognition. One of the goals,
again, is to show how our understanding of these topics interconnects.
Chapter 9, on the motivation of instrumental behavior, tells another story
that begins with the question of how motivational states affect behavior
and then how expectancies and predictive cues also motivate. The chapter
also covers addiction—although topics related to addiction are addressed
throughout the book. The final chapter, Chapter 10, provides the current
“synthetic” approach to instrumental learning and behavior. It begins by
considering avoidance learning, learned helplessness, misbehavior in ani-
mals, and today’s “cognitive” analysis of instrumental learning. All of this
provides an opportunity to reconsider and integrate some of the book’s
major themes. My hope is that the reader will leave the book with an ap-
preciation of how all the parts fit back together. One of the nicest things
that was ever said to me about the First Edition was that it “reads like a
novel.” I hope that telling good stories that link the issues together will
help the reader enjoy and understand the concepts more deeply while also
appreciating how good science works.
One of the pleasures of writing the Second Edition was having the
chance to catch up on so much excellent research. I still believe that mod-
ern Learning Theory provides a perspective and vocabulary that is highly
useful to all psychologists and behavioral neuroscientists. In addition to
making the complexities of the field accessible to students, I hope the book
will convey its intellectual excitement to anyone who reads it. Through-
out, I also hope the reader will find enough ideas about how the concepts
can be applied to real-world issues to help illustrate why basic research is
worthwhile. Integration and application are also emphasized in the new
Discussion Questions that are included at the end of every chapter.

Acknowledgments
Writing the book depended on many interactions and discussions with
far too many friends and kindred spirits to name here. Bernard Balleine
and Merel Kindt were hosts during a recent sabbatical leave and provided
helpful feedback on some new sections. Vin LoLordo provided sound ad-
vice and feedback all along. I also benefited from comments on individual
chapters provided by my students and former students, Cody Brooks,
Byron Nelson, Scott Schepers, Jay Sunsay, Eric Thrailkill, Travis Todd, and
Sydney Trask. I also want to thank a number of others who commented
on chapters from the first edition: Aileen Bailey, Bob Batsell, Kristin Bion-
dolillo, David Bucci, Allison Deming, Michael Emond, Dennis Jowaisas,
Jonathan Kahane, Richard Keen, John Kelsey, Henry Marcucella, Ronald
Miller, Michael Serra, Amanda Shyne, Janice Steirn, Chris Sturdy, Brian
Thomas, Lucy Troup, Sheree Watson, Cedric Williams, and Brian Wiltgen.

As I said in the preface to the First Edition, the warts that remain in the
final product are my fault, and not theirs.
At Sinauer Associates, my Editor, Sydney Carroll, kept the author and
the project going with great understanding, warmth, and humor. Katha-
leen Emerson masterfully organized development of the new colorized art
program and with the assistance of Alison Hornbeck, kept the production
moving forward. Christopher Small, Joanne Delphia, and Beth Roberge
Friedrichs are responsible for the wonderful design and “feel” of the book.
I am still indebted to two great teachers: Roger M. Tarpy, who taught my
first Learning course, and Robert C. Bolles, who was a wise and inspiring
PhD advisor and friend.
My writing was indirectly supported by the Robert B. Lawson Green
and Gold Professorship at the University of Vermont, a Visiting Professor-
ship at the University of Amsterdam, a residence at the Brain and Mind
Research Institute at the University of Sydney, and by research grants from
the National Institutes of Health.
For all of this help and support, I am grateful.

Mark E. Bouton
Burlington, Vermont
February, 2016
Media and Supplements to accompany Learning and Behavior, Second Edition
For Students
Companion Website (www.sinauer.com/bouton2e)
The Learning and Behavior, Second Edition companion website includes
resources to help students learn and review the content of each chapter
and test their understanding of the concepts presented in the textbook.
The site includes the following resources:
• Chapter Outlines
• Chapter Summaries
• Flashcards
• Glossary
• Online Quizzes (Adopting instructors must register online in order
for their students to use this feature)

For Instructors
Instructor’s Resource Library
The Learning and Behavior Instructor’s Resource Library includes the fol-
lowing resources:
• Textbook Figures & Tables: All of the textbook’s figures (including
photos) and tables are provided in both JPEG (high- and low-
resolution) and PowerPoint formats. All images have been
formatted and optimized for excellent legibility when projected.
• NEW! Lecture Presentations: New for the Second Edition, a
complete, ready-to-use lecture presentation is provided for each
chapter. These presentations cover all of the important material in
each chapter and include selected figures and tables.
• Instructor's Manual: The Instructor's Manual includes the following sections for each chapter of the textbook:
• Chapter Outline
• Learning Objectives
• Class Discussion and Critical Thinking Exercises
• Suggested Additional Resources
• Key Terms
• Test Bank: A comprehensive set of exam questions is provided for
each chapter of the textbook, in both multiple choice and short
answer formats (companion website online quiz questions also
included). New for the Second Edition, each question is referenced
to Bloom’s Taxonomy and to textbook sections. The Test Bank is
provided in several formats:
• Word files, by chapter
• Diploma test creation program (software included).
Diploma makes it easy to create quizzes and exams using
any combination of publisher-provided questions and
your own questions. Diploma also exports to a wide
range of formats for import into learning management
systems such as Blackboard, Moodle, and Desire2Learn.
• Blackboard files, for easy import into your Blackboard
course

Online Quizzing
The online quizzes that are part of the Learning and Behavior Compan-
ion Website include an instructor administration interface that allows
the quizzes to be used as assignments. Instructors also have the ability
to create their own quizzes and add their own questions. (Adopting
instructors must register with Sinauer Associates in order for their stu-
dents to be able to access the quizzes.)
Chapter Outline
Philosophical Roots  5
  Are people machines?  5
  Associations and the contents of the mind  7
Biological Roots  9
  Reflexes, evolution, and early comparative psychology  9
  The rise of the conditioning experiment  12
A Science of Learning and Behavior  14
  John B. Watson  14
  B. F. Skinner  17
  Edward C. Tolman  20
  Computer and brain metaphors  22
  Human learning and animal learning  25
Tools for Analyzing Learning and Behavior  27
  Learning about stimuli and about behavior  28
  Crows foraging at the beach  29
  Human eating and overeating  31
  Kids at play  31
  People using drugs  33
  Relations between S, R, and O  33
Summary  35
Discussion Questions  37
Key People and Key Terms  38
Chapter 1
Learning Theory: What It Is and How It Got This Way

Most people holding this book are at least a little familiar
with Learning Theory. The topic is often mentioned in
many survey and introductory courses in psychology.
It is also part of the popular culture; cartoonists, for
example, have mined it very well (Figure 1.1). My goal
in this first chapter is to give you information about
what the field is really like, why it is useful, and how it
got to be what it is today.
Psychology’s interest in learning does not need
much introduction because there is little doubt that
learning is crucial in our lives. You have been learning
in school since you were very young, of course. But
learning is even more pervasive and important than
that—it is truly happening all the time. When you got
up in the morning, you already knew where to find the
coffee, how to create the perfect mix of hot and cold
water in the shower, and where to find your jacket if it
looked like it was cold outside. As you walked to class,
you knew the route, the location where the sidewalk
was recently closed because of repairs, the people you
might encounter along the way, and so forth. On trips
to the snack bar, you probably also knew the types
of food that would be available, the ones you like the
best, where to find the soft drinks, and where your
friends might be. All this knowledge is based on your
past experience. Learning is always in progress—al-
ways helping us adapt to our environment. As Spreat
and Spreat (1982, p. 593) put it: “Much like the law
of gravity, the laws of learning are always in effect.”
Figure 1.1  How the layman views Learning Theory. (From ScienceCartoonsPlus.com.)

Not so easy for many people to understand are the methods psycholo-
gists often use to study learning. This book is really about the field in which
scientists often investigate learning by studying the behavior of animals
like rats and pigeons in laboratories equipped with mazes and Skinner
boxes. Nowadays, these methods are applied to a range of topics that might
surprise you. For example, Watanabe, Sakamoto, and Wakita (1995) used
them to ask how pigeons learn to categorize works of art, specifically paint-
ings by Monet and Picasso. Other experimenters (e.g., Crombag & Shaham,
2002; Marchant, Li, & Shaham, 2013) have used them to study how behavior
reinforced by taking drugs like heroin and cocaine can be treated and still
be vulnerable to relapse. These topics, and many others that are connected
with them, will be covered throughout this book. For now, though, I want
to note that how psychologists first came to see experiments with animals
in learning laboratories—as connected to the world at large—is itself a
rather interesting and colorful story. The main purpose of this chapter is to
relate that story. Part of that story involves how Learning Theory (what I
call the field that investigates learning and behavior principles by studying
animals learning in the lab) got started and evolved into what it is today. A
second purpose is to give you a frame of reference for understanding the
rest of the book as well as the field’s usefulness outside the laboratory. I
decided to write the book because I think that Learning Theory is as central
to understanding human and animal behavior as it ever was. Fortunately,
I like to talk and write about it, too.
The story of how things became this way started a few hundred years
ago, when philosophers were worrying about the nature of human na-
ture and the nature of the human mind. As modern science began to ma-
ture, such questions and issues were put into a scientific perspective. By
the 1800s, biology was beginning to provide some rather interesting new
answers. There was a startling new idea: People had evolved. Learning
Theory as we know it today was launched in the 1880s and 1890s, when
people set out to study a major implication of the theory of evolution: that
the human mind had evolved. Let’s start by looking at some of the early
ideas about human nature and the human mind.

Philosophical Roots
Are people machines?
In the 1600s, science underwent a major renaissance. Thanks to scientists
like Galileo and Newton, there was an exciting new understanding of me-
chanics, of how physical things like planets or billiard balls move and
interact. Craftspeople began to make better and better machines (Figure
1.2). For example, clocks became more intricate and accurate than ever
before. By the 1600s, the kinds of clocks one can still see in village squares
in Europe—with dolls chasing animals or ringing bells on the hour and so
forth—were fairly common. It was probably inevitable that people began
comparing themselves to these early robots and mechanical devices. Are
humans simply complex machines? What makes us different from the dolls
that dance and whir every hour on the hour?
Today, we are more likely to compare ourselves
to computers, but then, mechanical devices
reigned supreme. Is it possible to understand
human action from mechanical principles?
One person who famously considered these
questions—and also came up with a famous an-
swer—was René Descartes (1596–1650; Fig-
ure 1.3A). He said, in effect, that human beings
are indeed like machines, but only partly so.
Like other philosophers before him, Descartes
distinguished between the human mind and
body. He suggested that the body was an ex-
tension of the physical world, a machine that is
governed by physical principles like dolls and
clockworks. But every human also has a mind,
a spiritual, godlike thing that is the source of
free will and all voluntary behavior. The mind is
what makes humans more than mere machines.
It also separates humans from animals. Animals
are pure body, without mind and without free
will; their actions are governed by simple mechanical principles.

Figure 1.2  Illustration from a machinery book first published in 1661 devoted to pumps, presses and printing, and milling machinery by the Nuremberg architect Böckler. (Illustration © Timewatch Images/Alamy.)

Descartes did more than merely suggest the mind-body distinction. He also proposed a mechanistic principle, called reflex action, that was supposed to explain the body's activity.

Figure 1.3  (A) René Descartes, who wondered whether humans were machines,
and (B) came up with the concept of reflex action. (A, image courtesy of Na-
tional Library of Medicine; B, illustration reproduced in Boakes, 1984.)

For every action of the body, there is a stimulus that makes it happen
(see Figure 1.3B). The child puts her hand in a fire, and the fire causes her
hand to withdraw. The doctor drops the hammer on your knee, and your
leg moves. There is a simple, automatic connection between stimulus and
response. Descartes suggested that the stimulus agitated “animal spirits”
that traveled up the nerves (basically, hollow tubes) and made the muscles
swell—remember, it was 1637—but the larger idea that a reflex connects
a stimulus and response went on to have an enormous impact on biology
and psychology.
For all the importance of reflex action in human behavior, according
to Descartes, the mind still ruled—it could always intervene and modi-
fy a reflex. However, other thinkers were not as shy about claiming that
all human behavior follows scientific principles. For example, Thomas
Hobbes (1588–1679) argued that even the mind follows physical laws. He
suggested that all human thought and action is governed by hedonism,
the pursuit of pleasure and avoidance of pain. (This principle, familiar to
most college students, is what reinforcement theory is really all about.)
There was also Julien de la Mettrie (1709–1751), who saw more similarity
between humans and animals than Descartes did. Once, while de la Mettrie
had a fever, he realized that the body actually affects the mind. Fevers can
affect your thoughts, and so can wine or coffee. De la Mettrie’s book was
entitled Man a Machine (1748). By the 1700s, the idea that human nature,
and perhaps the human mind, could be understood by scientific principles
was launched and on its way.

Figure 1.4  Two famous British Empiricists: (A) the English philosopher, John
Locke, who founded the school of empiricism, and (B) David Hume, a Scottish
philosopher and historian. Both believed that the mind is an empty, passive
thing that receives and associates sense impressions. (A, image courtesy of
National Library of Medicine; B, image © Classic Image/Alamy.)

Associations and the contents of the mind


At roughly the same time, a group of philosophers in Britain was think-
ing about the things the mind contains. Collectively, these philosophers
are now known as the British Empiricists. Two of the most famous were
John Locke (1632–1704; Figure 1.4A) and David Hume (1711–1776; Figure
1.4B). Of course, the mind as we know it is full of ideas. The Empiricists
emphasized that all ideas and knowledge are built up entirely from experi-
ence, a view known as empiricism. Thus, according to Locke, the mind is a
blank slate (tabula rasa) at birth, ready to be written on by experience. It is
obvious that an empiricist view is important in the psychology of learning;
it gives us a reason to investigate how experience shapes and changes us.
The British Empiricists took a very atomistic approach to the contents
of the mind; they believed that the mind receives only simple sensations
and that these simple inputs are combined to build up all complex ideas.
For example, Locke argued that when you see an apple, you actually see
a collection of sense impressions—an apple is red, round, and shiny. With
experience, these impressions are combined to form the complex idea of an
apple. Hume emphasized the importance of associations between ideas. If
we eat the apple, it might taste crisp and sweet. These impressions become
further associated with the apple’s visual properties; you can begin to see
the complexity of the idea of an apple. But the complexity is built up out
of simple parts—complicated ideas and trains of thought are constructed
from simple sense impressions and the associations between them.
Not surprisingly, later Empiricists (sometimes known
as Associationists) wrote more about how associations
are formed and how they operate—the so-called laws of
association. Hume himself argued that the “contiguity”
between ideas is important; two ideas will be associated
if they occur closely together in time. The association-
ists also argued that impressions will become associated
if they are similar, if they are lively, if the impressions
are “dwelt upon” for increasing amounts of time, and if
they have occurred together recently. With more and more
laws of association, a fairly complete theory of learning
and memory began to take shape. In fact, we will see that
many of these ideas are still a part of learning theory. Even our interest in learning itself stems from our belief that experience is important. You can often see an atomistic, associationist bias as well. Many researchers study learning in situations in which animals have an opportunity to associate simple events. Out of this, more complex things are expected to emerge.

Figure 1.5  Immanuel Kant, according to whom the mind is an active thing that molds experience in part according to inborn assumptions. (Image courtesy of National Library of Medicine.)
There has almost always been an alternative to empiri-
cism, though. Some call it rationalism (e.g., Bower & Hil-
gard, 1981). This point of view is often associated (sorry!) with Immanuel
Kant (1724–1804; Figure 1.5), a German philosopher who had a different
perspective on the content and activity of the mind. Kant agreed with the
British Empiricists that a lot of knowledge comes from experience—but not
all of it, he argued. In contrast to the Empiricists, Kant believed that some
things do exist in the mind before experience writes on it. The mind has an
inherent set of assumptions or ideas called “a prioris” that help mold and
organize experience. For example, we know that objects have substance. We
also know that things happen in space and in time. We also know causality,
that is, that some things cause other things to happen. According to Kant,
experience does not make us think this way. Rather, the mind comes into
the world prepared to do it. The mind is not a blank slate, as Locke would
have it, but more like an old-fashioned floppy disk for a computer (Bolles,
1993): before anything useful could be written on it, it had to be formatted
and given a structure; otherwise, the computer could not read it.
The influence of both empiricism and rationalism can still be seen in
modern research. For example, in several places in this book we will discuss
how animals like rats quickly learn to “dislike” and reject a food if they
become sick a few hours after first ingesting it. This phenomenon (called
taste aversion learning; see Chapter 2) had a profound influence on theories
of learning when it was first noticed in the 1960s (see especially Chapter 6).
One reason is that although the rat is quick to associate a flavor with illness,
he is not so ready to associate a sound or a light or even a texture with the
illness (e.g., Domjan & Wilson, 1972; Garcia & Koelling, 1966; see Chapter
2). To put it casually, it is as if when the rat gets sick, he exercises the a priori
assumption that “it must have been something I ate.” Rats behave as if they
blame taste over other cues for illness even when they are only one day old
(Gemberling & Domjan, 1982). At this age, experience has not had much
time to write on the rat pup’s tabula rasa. Thus, the bias seems to be inborn,
much the way Kant would have expected it. Although Kant would have
assumed that God designed the rat to “think” this way, today we assume
that evolution had a hand in it and that evolution has important effects on
learning (see, e.g., Chapters 2, 6, and 10). This point brings us to the other
major input to modern learning theory, namely, the input from biology.

Biological Roots
Reflexes, evolution, and early comparative psychology
As relevant as the philosophical roots may be, the most direct impetus to
Learning Theory was biology. By the middle of the 1800s, physiologists
were making impressive progress figuring out how reflexes work. Des-
cartes (himself a philosopher and mathematician) had convinced every-
one that reflex action was important; but from roughly the 1750s onward,
biologists really began to make significant scientific progress. They were
discovering how organized reflexes actually are. They were also discov-
ering that electricity (itself a new discovery) is involved in reflex action.
They were even beginning to estimate the speed of neural transmission.
In 1866, Ivan Sechenov (1829–1905), a physiologist from Russia who
had studied reflexes with all the great European scientists, put many of
these ideas together. He wrote a book entitled Reflexes of the Brain. In it, he
argued that mental processes could be analyzed in terms of physiological
mechanisms, namely the ones involved in reflexes. Thoughts, he argued,
are responses—reflexive responses to stimuli. He noted that when going
to sleep at night, he might think of the emperor of China. In the daytime,
if he happened to lie down in bed, he might automatically think of the
emperor again. His book also emphasized inhibition, a reflex phenomenon
discovered in the 1800s. A reflex could be bottled up and inhibited, wait-
ing to be triggered by some tiny stimulus. In fact, emerging science on the
reflex could go a long way toward explaining behavior and mental activity.
Mental activity was going to be cracked by modern biology.
Another source of input arrived in the middle of the 1800s: Charles
Darwin (1809–1882; Figure 1.6A). Darwin, of course, was the Englishman
who traveled around the world taking notes on all the plants and animals
he encountered. In 1859, he described the theory of evolution in his book
On the Origin of Species by Means of Natural Selection. The theory was dis-
cussed and debated in Britain for years. It is so important to us now that
it is almost impossible to overstate it. All life has evolved through natural
selection. There is continuity between humans and animals. Humans and
animals are alike in their struggle for survival.

Figure 1.6  (A) Charles Darwin, at about age 45, when he was writing On the
Origin of Species by Means of Natural Selection. (B) A drawing from one of
Darwin’s notebooks; he correctly saw evolution as a bush, and not a single line
going from simple organisms to complex ones. (A, image reproduced in Boakes,
1984; B, diagram reproduced in Boakes, 1984.)

The theory of evolution changed the way people see life as well as the
relationships between humans and animals. One of the most remarkable
implications of the idea is that maybe the human mind—the thing that was
supposed to separate humans from animals—had itself evolved. Darwin
addressed the idea in another book entitled The Descent of Man and Selec-
tion in Relation to Sex (1871). But the idea of evolution was mostly pursued
in the late 1800s by a group of English scientists now known as the early
comparative psychologists. Their goal was to trace the evolution of the
mind by studying the mental lives of animals. George Romanes (1848–1894)
went about collecting examples of animal behavior—often stories about
pets reported by their owners—and made inferences about the kinds of
mental abilities they represented or required. Romanes saw himself as a
kind of geologist looking at the strata of a layered fossil record. Much of his
thinking was based on the idea that evolution works as a linear progres-
sion, from the simple to the complex. In fact, as Darwin himself recognized
(Figure 1.6B), evolution does not create a single line of progress but a rich
set of branches on a complex bush.
Another early comparative psychologist, C. Lloyd Morgan (1852–1936;
Figure 1.7A), was much more conservative about the mental abilities he
attributed to animals. For example, whereas Romanes attributed all sorts
of mental abilities to dogs, Morgan emphasized how slowly his own fox
terrier, Tony (Figure 1.7B), had learned to open the latch of a gate. Morgan

Figure 1.7  (A) C. Lloyd Morgan, the early comparative psychologist. (B)
Morgan’s dog, Tony, a clever animal who nonetheless probably learned things
incrementally. (A, image from http://www-dimat.unipv.it/gnoli/lloyd-morgan2.jpg; B, image reproduced in Boakes, 1984.)

also tested Tony’s spatial abilities. He threw a long stick over a fence and
sent Tony to retrieve it. There was a gap between pickets in the fence of
about 6 inches. Did Tony understand the spatial relationship and try to fit
the stick through the gap the long way? No; Tony barely got the problem
right after many repetitions over several days. When Morgan repeated the
experiment with a stick with a crook in it, Tony caught the crook repeatedly
as he tried to force it through the gap. Eventually, the crook broke off when
Tony grabbed the stick at that end. A man who happened to be walking
by at that moment commented, “Clever dog that, sir; he knows where the
hitch do lie.” “The remark was the characteristic outcome of two minutes’
chance observation,” Morgan wrote (Morgan, 1894, p. 258). There is a mes-
sage here. Some of the most remarkable things that we see animals do may
actually be built up slowly from fairly laborious and simple processes. A
certain amount of skepticism is appropriate.
Morgan is best known for his “law of parsimony,” better known as
Morgan’s Canon. Put simply, his canon states that an example of behav-
ior should not be explained by a complex, high-level mental process if it
can be explained with a simpler one. Morgan thought that we should be
stingy—parsimonious—with how we explain behavior. This idea is still
with us today. In many ways, Learning Theory is merely a continuation
of the grand tradition started by the early comparative psychologists. For
example, later in this book (see Chapter 8), you will see that a part of
the field devoted to studying animal cognition is expressly interested in
studying the hypothetical mental processes in animals. But we are still very much interested in arriving at the simplest possible explanation of
complex examples of behavior.
The rise of the conditioning experiment
Morgan and many of his contemporaries became interested in studying
animal learning as a way of understanding the animal mind. Edward L.
Thorndike (1874–1949; Figure 1.8A) was another early comparative psy-
chologist who was interested in exactly this subject. In 1898, Thorndike
published his doctoral dissertation, which was on the subject of the intel-
ligence of cats. To study cat intelligence, Thorndike examined how cats
learned in an apparatus called a puzzle box (Figure 1.8B). The puzzle box
was made of wood and chicken wire. The cat was put in the box and had to
learn to open a latch inside to get out and get at some food that was located
nearby. Thorndike’s cats learned this puzzle fairly well, but they seemed to
learn it slowly and gradually. In addition, Thorndike was unable to convey
the “idea” of how to open the latch to several cats by moving their paws
so as to open the latch. A simple (parsimonious) explanation was possible:
When the cats got fed after getting out of the box, the food strengthened
a simple association between the situation (S) and the latch-opening response (R). That is all that was necessary to explain the gradual learning that Thorndike observed from the puzzle box: the gradual "stamping in" of an S-R association.

Figure 1.8  (A) Edward Thorndike. (B) Two of the puzzle boxes Thorndike used to study the intelligence of cats. (A, photograph courtesy of National Library of Medicine; B, photographs from Manuscripts and Archives, Yale University Library.)
Some years later, Thorndike (1911) saw that his mechanism for explain-
ing learning in cats was also a good way to understand learning in other
animals and people in general. He proposed the law of effect. When a
response is followed by satisfaction (as is provided by food), an S-R con-
nection is strengthened. When it is followed by discomfort (as would be
provided by, say, a mild shock), the S-R association is weakened. Thorndike
went on to have a great career as an educational psychologist. But the
puzzle box experiments and the law of effect launched a very significant
part of modern learning theory that I will return to later in this chapter.
Meanwhile, there was more going on in Russia, the home of Sechenov
and a country steeped in the tradition of studying the biology of the reflex.
Ivan Pavlov (1849–1936; Figure 1.9A), who was a celebrated physiologist,
was also beginning to study learning in animals. Pavlov had begun his distin-
guished career investigating the reflexes of digestion. When we digest food,
we secrete various gastric juices, some enzymes from the pancreas, and a
hormone (insulin) that helps the body’s cells absorb nutrients. We also sali-
vate. In experiments with dogs, Pavlov showed how many of these responses
were coordinated by the nervous system. His studies were so important that
he won the 1904 Nobel Prize in Physiology or Medicine for them. He was
the first physiologist, and the first Russian, to win a Nobel Prize.
Several years before he won the Nobel Prize, though, Pavlov also began
to understand the importance of other kinds of digestive reflexes, namely


Figure 1.9  (A) Ivan Pavlov. (B) Pavlov’s classical conditioning set-up. Today, classi-
cal conditioning is viewed as an important behavioral phenomenon (it gives neu-
tral cues the power to elicit behavior) and as a method for studying associative
learning—how organisms learn to associate events in the world. (A, photograph
courtesy of National Library of Medicine; B, photograph © Bettman/Corbis.)

learned ones (Figure 1.9B). In 1897, at about the time that Thorndike was
also studying learning in his cats in the United States, Pavlov’s student
Stefan Wolfsohn was studying salivation in dogs. Dogs salivate to food
introduced to the mouth, of course. But after the food has been put in the
mouth a few times, the dog soon begins to salivate merely at the sight
of food. The dog has learned to drool to a signal of food. Pavlov and his
students called this a “psychic reflex.” We now know it as a conditioned
reflex.
In the usual textbook version of Pavlov’s simplest experiment, Pavlov
(or a student) would ring a bell and then present some food to a dog. With
a few bell-food pairings, the dog would begin to drool to the sound of the
bell in anticipation of the food. Pavlov understood that this kind of learn-
ing was an important principle of behavior. It was another kind of reflex
that helped the animal get ready to digest an upcoming meal. But Pavlov
also saw the conditioning experiment as a way to study the animal’s brain.
Psychic, learned reflexes of the brain were the way that someone could
learn to think of the emperor of China every time he lay down on a bed.
Conditioning was a way to study the psychological process of learning,
eventually working it out in physiological terms.
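As a preview of the more formal treatment of conditioning in Chapter 4, here is a minimal sketch (my own illustration, not anything from Pavlov or from this book; the function name and parameter values are arbitrary assumptions) of how an error-correction learning rule can describe associative strength growing over repeated signal-food pairings:

    # Python sketch (illustrative only; parameters are assumed, not from the text)
    def pair_signal_with_food(n_pairings=10, learning_rate=0.3, asymptote=1.0):
        """Associative strength V grows toward an asymptote with each pairing,
        in proportion to how much of the outcome is not yet predicted."""
        V = 0.0
        history = []
        for trial in range(1, n_pairings + 1):
            V += learning_rate * (asymptote - V)
            history.append((trial, round(V, 3)))
        return history

    print(pair_signal_with_food())
    # Strength rises quickly at first and then levels off—a negatively
    # accelerated acquisition curve.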
Pavlov arrived at his interest in learning from a route that was dif-
ferent from the one traveled by Thorndike and other early comparative
psychologists. He was interested in the reflex, whereas the comparative
psychologists were interested in evolution. Significantly, both of these
great traditions in biology converged on the fundamental importance
of learning and the conditioning experiment at around the turn of the
20th century.

A Science of Learning and Behavior


John B. Watson
At about this time, psychology in the United States was deeply influenced
by a school of thought known as structuralism. The goal of structuralists
was to analyze the structure of the mind. Their method was introspection:
Researchers looked into their own minds and described what they saw
there. Intuitively, introspection may seem like a good way to study the
mind, but it has extremely serious problems. For one thing, how do we
know that the mind’s contents are all conscious and available to our intro-
spections? Equally important, what are we going to do if two introspectors
disagree? For example, I might claim that all thoughts are connected to
mental images, and you might claim that you have thoughts that are “im-
ageless” (the idea of imageless thought was an actual controversy). Who is
correct? Who is the better introspector? Ultimately, a dispute between two
introspectors would have to be settled by an authority: We would have to
appeal to what other, wiser (and more powerful) introspectors might say.
The problem with introspection—and it is a very big problem—is that the
facts can only be settled by authority. “Facts” must be open to falsification;
if not, there can be no progress beyond the mere opinions of the powerful people. A science must have results that can be confirmed or disconfirmed by any investigator.
In 1913, John B. Watson (1878–1958; Figure 1.10), a
professor of comparative psychology at Johns Hopkins
University, published an article that became known as a
behaviorist manifesto. According to Watson’s paper, the
problem with psychology was that it had become too sub-
jective. Debates about imageless thought could go on (and
go nowhere) for hundreds of years. There was also little
information of any practical value that could be taken from
this kind of research. Instead, Watson argued that the field
of psychology should include the study of something ob-
jective and available to all: behavior. Behavior is something
that everyone can see, and any claim about it can be easily
confirmed or falsified.
Figure 1.10  John B. Watson, the great American behaviorist. Watson's point was that psychological science must concern itself with observable things, such as behavior. (Photograph courtesy of the Alan Mason Chesney Medical Archives of The Johns Hopkins Medical Institutions.)

Watson's call for the study of behavior was a challenge to introspection, although his version of behaviorism was crude by modern standards (see below). Watson argued that thought was nothing more than muscle twitches in the throat. He doubted the existence of dreams, perhaps because he claimed to have never personally experienced them (Bolles, 1979). His point about objectivity is still valid and important, however, and his behaviorist movement changed psychology forever. Today, psychologists can set about studying mental processes, but they typically do not do so by introspecting; rather, they link their hypothetical mental processes to behavioral output that can be studied and measured objectively.
The conditioning experiment was central to Watson’s vision of a sci-
entific psychology (Watson, 1916). One of his most famous experiments
(Watson & Rayner, 1920) was a conditioning experiment with a little boy
named Albert B. (known today as “Little Albert,” in contrast to Freud’s
famous case, “Little Hans”). Albert was an 11-month-old boy whose mother
was a wet nurse in a Baltimore hospital. In their experiment, Watson and
Rayner showed Albert a white rat, and as he reached out to touch it, they
frightened him with a loud noise. After a few trials, Albert began to react
with fright each time the rat was given to him. An emotional response—
fear—had been conditioned in Albert, just like the salivary response had
been conditioned in Pavlov’s dog. Moreover, Albert’s new fear of the rat
generalized to other furry things, such as a rabbit and a fur coat. Watson
and Rayner’s paper poked fun at Freud’s analysis of phobias, which were
based on unresolved Oedipal complexes and the like. They had found
an alternative source of emotional problems. Twenty years later, if Albert
were to check into a psychiatric clinic because of his fear of furry things,
the local psychoanalyst would probably make him improve his relation-
ship with his father. But you and I know that it was simply a conditioning
experience. Unfortunately, Albert was removed from the laboratory before
Watson and Rayner had a chance to decondition his fear. Of course, Albert’s
conditioning experience was not really very traumatic (see Harris, 1979).
But the experiment still has meaning today because we still suspect a role
for conditioning in many emotional disorders that begin with exposure
to panic or trauma (e.g., Bouton, Mineka, & Barlow, 2001; Eysenck, 1979;
Mineka, 1985; Mineka & Zinbarg, 2006). Interestingly, recent historians
have tried to discover what became of Little Albert (e.g., Beck, Levinson,
& Irons, 2009; Powell, Digdon, Harris, & Smithson, 2014). According to the
more recent account, Albert was probably a boy named Albert Barger who
went on to have a fairly normal life—with only a mild dislike for animals,
particularly dogs, according to a niece (Powell et al., 2014).
Watson was a flamboyant character. Unfortunately, his career as a
psychologist ended in scandal when someone discovered that he and
his research assistant, Rosalie Rayner, were having an affair. Watson was
forced to resign from the university, and he eventually joined the J. Walter
Thompson advertising agency, where his interest in empirical work—in
this case, market research—paid off very well. By 1924, he was one of four
vice presidents of the company (Boakes, 1984).
Outside of academia, Watson continued to write about psychology. One
of his most famous books, Behaviorism (1924), was full of colorful, impas-
sioned claims. His emphasis on learning is illustrated by the lines from this
book that you have probably seen many times before:
Give me a dozen healthy infants, well-formed, and my own specified
world to bring them up in and I’ll guarantee to take any one at
random and train him to become any type of specialist I might
select—doctor, lawyer, artist, merchant-chief and, yes, even beggar-
man and thief, regardless of his talents, penchants, tendencies,
abilities, vocations, and race of his ancestors. (Watson, 1924, p. 104)
It is a nice sentiment, although perhaps a little overstated; the right en-
vironment can make the world better for anyone, regardless of race or
station of birth. In truth, Watson was not quite as naive as this sentence
suggests. It is always quoted out of context. The very next sentence in
the book is
I am going beyond my facts and I admit it, but so have the advocates
of the contrary and they have been doing it for many thousands of
years. (Watson, 1924, p. 104)
Watson was essentially a campaigner; he was out to change the world.
Although his early ideas have been steadily transformed over the years,
his emphases on behavior and on learning are still a very important part
of psychology today. In fact, his views that psychology should rely on ob-
jective evidence and that it should be useful have inspired many research
psychologists ever since.

Figure 1.11  (A) B. F. Skinner. (B) A modern “Skinner box.” Although the rat is
free to press the lever to its left as often as it likes, the behavior is still con-
trolled by its consequences (i.e., presentation of the food pellet). (A, photo-
graph courtesy of National Library of Medicine; B, courtesy of Med Associates.)

B. F. Skinner
Watson’s call to behaviorism was followed by a great deal of discussion
about what it actually takes to be scientific about behavior. One of the
most important writers on this subject was B. F. Skinner (1904–1990;
Figure 1.11A), who developed a type of behaviorism that is called radical
behaviorism today. Skinner was actually an English major, not a psychol-
ogy major, and he began thinking about behaviorism after he had finished
college. In his early papers on the subject (e.g., Skinner, 1931, 1935), he
considered our old friend, the reflex. He noted that one meaning of the
reflex is that it describes an empirical relationship between two events:
Given the stimulus, a response is also likely to occur. People in Skinner’s
day also gave the reflex an additional meaning: the physiological reflex
arc. The physiological concept introduced a new set of events between
the stimulus (S) and the response (R); a sensory neuron must respond to
S and then fire a motor neuron that excites muscle and makes R. Talk-
ing about these physiological events does not change the basic empiri-
cal relationship between S and R. Instead, Skinner suggested, it merely
introduces a new set of things that need to be explained. How does the
sensory neuron fire? How does it excite the motor neuron? What physi-
ological events actually make the muscle move? To explain these things,
we must study other sciences, such as physiology and molecular biology.
The physiological reflex introduces a large number of new questions. It
has taken us away from behavior. Psychologists, the radical behaviorist
claims, should therefore stick to the simple empirical correlation between S and R.
The same argument can be made about mentalistic events we might
want to stick between stimulus and response. For sure, concepts like short-
term memory and attention are interesting, but introducing them merely
introduces new things that have to be explained. Skinner did not deny that
people have nervous systems or that they experience private events they
call cognitions. But at best, these things merely introduce new questions
that need to be answered. At worst, they masquerade as useful explanations
(what does it really mean to say that the child responded because she had
an insight?). The new concepts do not add anything useful, and they do not
change the basic relationship between S and R. There is a very important
message here. Your behavior is not caused by another little person living
inside your head. It is lawfully related to the environment. Skinner’s views
are widely held today by a number of active scientists and practitioners
known as “behavior analysts” (or “Skinnerians”).
Skinner is important to our story for many reasons, and we will return
to him later in this book. One of the most important things that he did was
to contribute a method: He designed the first Skinner box (Figure 1.11B).
(Skinner did not name it that; Clark Hull, who is introduced later, did.) In
the 1920s and 1930s, it was common to study the behavior of rats in mazes.
Mazes are useful and interesting, but using them is labor-intensive because
an experimenter must hover near them until the rat gets from the start to
the goal. Skinner, who was interested in gadgets, eventually closed the rat
up in a small box and gave it a little lever to operate. When the lever was
closed, another gadget delivered a food pellet to a cup attached inside the
box to the wall nearby. Not surprisingly, the rat learned to press the lever
to get the food pellet. If pellets were made contingent on lever-pressing,
the rat would press the lever many times in an hour.
This arrangement is now known as the operant experiment, and it
has become an extremely important tool for research. The rat’s lever-press
response is called an operant because it operates on the environment.
The food pellet, which increases the rate of responding when it is made
a consequence of the response, is called a reinforcer. Nearly everyone is
familiar with the operant experiment—rats in boxes have been in maga-
zine cartoons since at least the 1950s—but there is a fundamental insight
here that is easy to miss. In the operant experiment, the rat is put in the
box for a period in which it can press the lever whenever it wants to. The
situation was different for the cat in Thorndike’s puzzle box. The cat had
only one opportunity at a time to respond. Once out of the puzzle box,
it had to wait for Thorndike to set up another trial. In contrast, the rat in
the Skinner box is free to make the response over and over, as often as it
likes. Or it can curl up and sleep in the corner if it so decides. Pressing the
lever is purely voluntary. In this sense, the operant experiment captures an
important characteristic of behavior in the world outside the laboratory:
Within certain constraints, you and I are free to do more or less what we
want when we want to. Skinner’s insight was that what we call “free” or
“voluntary” behavior, like responding in the Skinner box, is still lawfully
related to the environment. It increases or decreases according to its payoff.
Operant, “voluntary” behavior is controlled by its consequences, and it can
be studied using the Skinner box.

Table 1.1  The operant-respondent distinction
  Respondent: Controlled by antecedents; “Elicited”
  Operant: Controlled by consequences; “Emitted”
In fact, the operant experiment exposes a new kind of empirical corre-
lation between behavior and environmental events. The operant response
is lawfully related to its consequences. We know this is true because we
can present the food pellet in different ways. For instance, if we present
a pellet every time the rat presses the lever, responding will happen at a
high rate, but if we stop presenting pellets, the lever pressing will stop. The
operant’s correlation with its consequences is somewhat different from the
correlation in the traditional Descartes-and-Pavlov reflex in which the child
withdraws her hand from the fire or the dog salivates to the sound of the
bell. In these cases, the response is elicited by an antecedent stimulus. It
is a response to an event that precedes it; this sort of behavior is a respon-
dent. Unlike operant behavior, which is controlled by its consequences,
respondent behavior is controlled by its antecedents. I have just described
Skinner’s operant-respondent distinction (Table 1.1).
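To make the operant half of this distinction concrete, here is a minimal, hypothetical simulation (the response-probability rule and every number in it are illustrative assumptions of mine, not data or equations from the text) of how a consequence can come to control a “voluntary” response: the tendency to press rises while presses produce pellets and falls when they no longer do.

```python
import random

def simulate_lever_pressing(reinforced_minutes=30, extinction_minutes=30,
                            learning_rate=0.05, seed=0):
    """Toy illustration of an operant contingency: each 'minute' the rat may
    press the lever; a press that produces a pellet nudges the response
    probability up, and a press that produces nothing nudges it back down."""
    random.seed(seed)
    p_press = 0.10                      # starting tendency to press
    rates = []                          # proportion of minutes with a press, per phase
    phases = (("reinforcement", reinforced_minutes, True),
              ("extinction", extinction_minutes, False))
    for phase, minutes, pellet_available in phases:
        presses = 0
        for _ in range(minutes):
            if random.random() < p_press:          # the rat presses
                presses += 1
                if pellet_available:                # consequence strengthens the operant
                    p_press = min(1.0, p_press + learning_rate)
                else:                               # no consequence: responding weakens
                    p_press = max(0.0, p_press - learning_rate)
        rates.append((phase, presses / minutes))
    return rates

print(simulate_lever_pressing())
# Expected pattern: a higher press rate during reinforcement than during extinction.
```

The point of the sketch is only that nothing precedes the press and forces it out; the behavior is emitted, and its future likelihood tracks what follows it.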
Skinner’s radical behaviorism seeks out empirical correlations between
behavior and environmental events and finds two very broad types. Some
behaviors are controlled by their consequences (operant behaviors), and
others are controlled by their antecedents (respondent behaviors). The
system does not predict that certain types of behavior (barking, drool-
ing, writing, drawing) will be operant or respondent; rather, every new
empirical relationship is open to discovery. We must figure out whether
any particular behavior is controlled by something that precedes it or fol-
lows it. The operant-respondent distinction is extremely useful in the clinic
because it provides a simple way to discover how to control behavior. If a
child who screams in public is brought to a behavior analyst, the analyst
will look at the behavior’s antecedents and also look at its consequences.
Either one can be manipulated. If changing the consequences changes the
behavior, we have an operant. If changing the antecedents changes the
behavior, we have a respondent. Either way, the behavior analyst wins
because the behavior has been brought under control to the benefit (we
hope) of everyone. The basic system is so simple that it can be readily ap-
plied to any new situation (and has been, with very positive effects). The
radical behavioristic approach that Skinner began has done a great deal
of good in the world.
Figure 1.12  Edward C. Tolman was the learning theorist who developed
“operational behaviorism,” which made it acceptable to explain behavior with
unobservable constructs (like motivation or cognition), provided they are
specifically linked to observable input and output. (Photograph courtesy of
University of California, Berkeley.)

Edward C. Tolman
It may surprise you to learn that you do not have to be a
Skinnerian to be a behaviorist. At least one other brand
of behaviorism has been adopted in one form or another
by every field of scientific psychology. Like Skinner’s ap-
proach, this one was developed in the 1920s and 1930s
while people were discussing what it takes to be scientific
about behavior. This version differs from radical behavior-
ism in that it accepts unobservable events in the explanation of behavior,
provided they are used rigorously and carefully. The approach, often called
operational behaviorism, got its start with Edward C. Tolman (1886–1959;
Figure 1.12), an early behaviorist who spent his career at the University
of California, Berkeley.
Tolman described his perspective in several articles, although one of
the most important was a paper he published in 1938. His idea can be il-
lustrated with an example by Miller (1959). Figure 1.13A shows a series of
empirical correlations that one can easily gather in the laboratory. A rat’s
drinking increases if it is deprived of water, fed dry food, or injected with
a salt solution. So does pressing a lever for water in a Skinner box, and so
on. The list of stimuli and responses could be expanded easily, of course.
A nest of arrows would accumulate, getting bigger and more complicated
with each new discovery.
The relationship can be simplified by using what Tolman (1938) called
an intervening variable (another term we often use is theoretical con-
struct, e.g., Bolles, 1975). The idea is illustrated by adding “thirst” between
all the stimuli and responses as shown in Figure 1.13B. In this scheme, the
things that we manipulate (input, at left) affect thirst, and thirst in turn
produces the behavioral effects described (output, at right). The thing called
“thirst” is not directly observable; it is inferred from its effects on behavior.
It is nonetheless reasonable to use the term if we are careful to link it to
several effects it has on behavior and to the operations that are supposed
to increase or decrease it. This point is important. Invoking “thirst” would
have no use in explaining behavior if we had no understanding of (a) how
it comes about and (b) how it affects behavior.
In principle, the scheme presented in Figure 1.13B can be tested and
proven wrong. For example, if a given number of hours of water depriva-
tion leads to more lever pressing than a given amount of consumed dry
food, it should also lead to more water consumption and acceptance of
more quinine. If these predictions were disconfirmed, we would have to
consider other schemes. The use of well-specified intervening variables is
scientific because the structure can be tested and falsified.

Figure 1.13  A theoretical construct like “thirst” is not directly observable, but it
simplifies the explanation of behavior. (A) A set of empirical relations between
experimental manipulations (left: hours of water deprivation, dry food, saline
injection) and behaviors (right: volume of water consumed, rate of lever pressing
for water, quinine tolerated). (B) A simpler set of relations with “thirst” intervening.
In this diagram, notice that thirst is linked to the empirical world on both the input
side and the output side. The linkages are crucial because they make the system
falsifiable and therefore scientific. (After Miller, 1959.)
Using intervening variables also has advantages. For one thing, it sim-
plifies the picture by reducing the messy number of empirical S-R corre-
lations. It also suggests new hypotheses and relationships that stimulate
new research. One of Tolman’s points was that we always use theoretical
constructs, and so do other sciences (gravity and atomic structure, for ex-
ample, are important in physics even though they are inferred rather than
observed directly). We just have to use them scientifically by making sure
that they are “anchored” in the empirical world of stimuli and responses.
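The bookkeeping advantage is easy to see: with three manipulations and three behavioral measures, a fully crossed set of findings means up to 3 × 3 = 9 separate empirical arrows, whereas routing everything through “thirst” requires only 3 + 3 = 6 anchored links. The sketch below is a hypothetical toy model of that idea (the coefficients and function names are illustrative assumptions, not Tolman’s or Miller’s actual equations); it simply shows the construct sitting between inputs and outputs.

```python
def thirst_level(hours_water_deprivation=0.0, grams_dry_food=0.0, saline_injected=False):
    """Input side: operations assumed (illustratively) to raise the construct."""
    return (0.05 * hours_water_deprivation
            + 0.02 * grams_dry_food
            + (1.0 if saline_injected else 0.0))

def predicted_behavior(thirst):
    """Output side: the same construct drives every behavioral measure."""
    return {
        "water_consumed_ml": 10.0 * thirst,
        "lever_presses_per_min": 2.0 * thirst,
        "quinine_tolerated": 0.5 * thirst,
    }

# Falsifiability in miniature: if deprivation raises lever pressing more than dry
# food does, the model says it must also raise water intake and quinine tolerance
# more; finding otherwise would force us to revise or abandon the construct.
deprived = predicted_behavior(thirst_level(hours_water_deprivation=24))
fed_dry = predicted_behavior(thirst_level(grams_dry_food=20))
print(deprived, fed_dry, sep="\n")
```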
Nearly every part of psychology has adopted the operational behavior-
istic framework. Thus, a social psychologist may study stereotypes and
prejudice, but these terms are really only useful if they are operationally
defined. Cognitive psychologists play the same game. Working memory,
semantic memory, and attention are not directly observable, but they are
useful when they are linked to behavior.
Another pioneer in the use of intervening variables was Clark L. Hull
(1884–1952). Hull (1943, 1952) presented a highly systematic theory of
behavior that had a large impact in the 1940s and 1950s. Today, although
many of Hull’s ideas have been disconfirmed, his theory still stands as a
brilliant example of how theoretical constructs can be used. The theory
described a number of theoretical constructs that were carefully anchored
on both the antecedent side and the consequent side. Their interactions
were also specified. For example, two of Hull’s hypothetical constructs
were Drive and Habit. Drive was the motivation caused by biological
need (hunger, thirst, etc.); it was mostly influenced by being deprived of
something important. Habit was learning; it was mostly influenced by
the number of times the behavior had been reinforced. Based on empiri-
cal research (Perin, 1942; Williams, 1938), Hull concluded that Drive and
Habit multiplied to influence performance. As we will see in later chap-
ters, Hull’s theory was different from Tolman’s approach in important
ways, but it was similar in its emphasis on intervening variables and the
relations between motivation and learning. It is perfectly acceptable to
build a science of behavior using unobservable things in our explanations.
Tolman’s perspective is with us today, and although we consider a great
deal of research from the radical behaviorist perspective, this approach
is generally accepted in this book.
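The multiplicative rule is easy to state explicitly. A minimal sketch, assuming a deliberately simplified rendering of Hull’s claim (the variable names, scale, and numbers are illustrative, not Hull’s full system), is shown below; one testable consequence of multiplication, rather than addition, is that with zero Drive even a strong Habit produces no performance.

```python
def hullian_performance(drive, habit):
    """Simplified rendering of the claim that Drive and Habit combine
    multiplicatively to determine performance (illustrative units)."""
    return drive * habit

# A well-practiced but fully satiated animal (habit high, drive zero) is
# predicted not to respond, whereas an additive rule would predict otherwise.
print(hullian_performance(drive=0.0, habit=0.9))   # 0.0
print(hullian_performance(drive=0.5, habit=0.9))   # 0.45
```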
Computer and brain metaphors
After World War II, there were further changes in how psychologists began
to think about behavior. During the war, many psychologists helped design
military equipment with an eye toward making it easy for people to oper-
ate. They often worked with engineers who were interested in information
technology, such as how information is transmitted over telephone lines.
Psychologists began to think of people as processors of information. With
the rise of the computer in the 1950s and 1960s, this perspective became
more deeply ingrained. The idea was that people might operate like com-
puters do, executing programs as they handle information. By the 1960s, it
was becoming clear that computers could be programmed to solve prob-
lems and do other impressive tasks that humans do (e.g., Newell & Simon,
1961). The computer analogy was off and running in what we now know
as cognitive psychology.
How people think about behavior has always been influenced by the
technology of the times; remember how important the spirit of mechanism
was to Descartes and the development of the reflex concept. The computer
provides an extremely rich and compelling metaphor. The basic computer
receives input (e.g., from a keyboard or from another computer on the Web)
and eventually turns this information into output. It first transforms the
input into a code of simple on-off electrical charges. It then processes this
kind of information, performing calculations on it, transferring it to the
screen, and so forth, all while executing commands. Some of the informa-
tion is stored in a temporary memory that is lost forever if the computer
is switched off before the information is stored on the hard drive or a
memory stick. If we looked inside the machine, we would find that all
the work is being done by a central processor that manipulates the codes
and symbols in a sequence. It is all very logical. The computer does much
more than simply connect things, like a stimulus and a response, the way
an old-fashioned telephone switchboard does.

Figure 1.14  The “standard model” of cognition views cognition as a computer-like
information processing system, beginning with environmental input and terminating
in response output. Environmental input enters the sensory registers (visual,
auditory, haptic), which feed a short-term store (STS; temporary working memory
with control processes such as rehearsal, coding, decisions, and retrieval
strategies), which in turn exchanges information with a long-term store (LTS), the
permanent memory store, before response output. (After Atkinson and Shiffrin, 1971.)
The computer is the inspiration behind what is known as the informa-
tion processing approach in cognitive psychology. Figure 1.14 presents
an example of the “standard model” of cognition (Simon & Kaplan, 1989);
the figure is actually based on a model proposed by Atkinson and Shiffrin
(1971). The system emphasizes several theoretical constructs that intervene
between stimulus (environmental input) and response. Each is devoted
to processing information. The system receives sensory input from the
external world, and this input is processed into different forms at dif-
ferent points in the sequence. It enters the sensory registers, which very
briefly store it as raw visual or auditory input. If attention is paid to it, the
information is transferred into short-term memory, where, if it is rehearsed
enough, it eventually gets put into long-term memory. The influence of the
computer analogy is clear. This approach to human information processing
and memory has dominated the field for many years.
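The flow just described can be written out as a small program. The sketch below is a toy walk-through only (the capacity, the rehearsal threshold, and the method names are illustrative assumptions, not the parameters of Atkinson and Shiffrin’s actual model): input that is attended moves from the sensory register into a limited short-term store, and enough rehearsal transfers it to the long-term store.

```python
from collections import deque

class StandardModelSketch:
    """Toy sensory-register -> short-term store -> long-term store pipeline."""

    def __init__(self, stm_capacity=7, rehearsals_to_store=3):
        self.sensory_register = None              # raw input, held only momentarily
        self.short_term = deque(maxlen=stm_capacity)
        self.long_term = set()
        self.rehearsals_to_store = rehearsals_to_store
        self._rehearsal_counts = {}

    def sense(self, item, attended=False):
        """Input enters the sensory register; only attended input moves on."""
        self.sensory_register = item
        if attended:
            self.short_term.append(item)          # oldest item is displaced if full
            self._rehearsal_counts.setdefault(item, 0)
        self.sensory_register = None              # the raw trace fades quickly either way

    def rehearse(self, item):
        """Rehearsal keeps an item active and can transfer it to the long-term store."""
        if item in self.short_term:
            self._rehearsal_counts[item] += 1
            if self._rehearsal_counts[item] >= self.rehearsals_to_store:
                self.long_term.add(item)

memory = StandardModelSketch()
memory.sense("phone number", attended=True)
for _ in range(3):
    memory.rehearse("phone number")
print("In long-term store:", memory.long_term)
```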
The computer metaphor is still important in psychology, but it cannot
really tell the whole story. One problem is that it has surprisingly little to
say about learning; instead, it mainly tells us about symbol manipulation.
Another problem is that although the standard model makes sense as a
series of hypothetical constructs, it is not likely that the brain is really or-
ganized this way. Neurons in the brain operate fairly slowly, somewhere
in the range of milliseconds. In contrast, computer components operate in
the range of nanoseconds, about a million times faster (Rumelhart, 1989).
There is an actual limit to the number of steps that the brain can perform
in sequence in a task that takes a person a second or so to complete. This
sort of calculation suggests that the brain must perform many operations
in parallel—it must do more than one thing at a time—rather than in se-
quence, the way the typical computer operates.

Figure 1.15  The “connectionist” point of view is that cognition results from a
brain-like set of nodes (or units) and their interconnections. The diagram illustrates
the concepts of dog and cat with feature nodes such as bark, meow, ears, floppy
tongue, arched back, tail, white fur, and black fur. Each concept is a set of features
(floppy tongue, ears, bark) that are associated with many other features. When a
new dog appears, at least some of the features become activated and further
activate other features, depending on the strength of the interconnections.
Concept learning requires the association of many features. (After McClelland
& Rumelhart, 1985.)
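For readers who want the arithmetic behind the sequential-steps point made just above (before Figure 1.15), a back-of-the-envelope version, using the round millisecond and nanosecond figures mentioned in the text rather than precise measurements, looks like this:

\[
\frac{1\ \text{s per task}}{10^{-3}\ \text{s per neural step}} \approx 10^{3}\ \text{sequential steps}
\qquad\text{versus}\qquad
\frac{1\ \text{s}}{10^{-9}\ \text{s per machine operation}} \approx 10^{9}\ \text{operations}.
\]

On this rough accounting, a serial computer can run about a million times more steps in the same second, which is why a brain built from slow components is thought to get most of its work done in parallel.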
Therefore, a somewhat different perspective began to emerge in the
1980s. It is often called connectionism, although it is also referred to as
parallel distributed processing or neural networks (e.g., Rumelhart &
McClelland, 1986a, 1986b). The perspective represents a shift away from
the computer metaphor and back to a brain metaphor. The basic idea is that
cognition can be understood as a network of connections between units
that look a little bit like neurons. Figure 1.15 illustrates a simple network
with just a few of these units, or “nodes.” Each node is connected to each
of the other nodes. If one node is activated, the activation travels to the
connected nodes depending on the strength of the individual connections.
You can imagine a large number of nodes being activated and their activa-
tion being transferred to other nodes through a multitude of connections
that all get excited in parallel.
As an example, a network like the one in Figure 1.15 can recognize
dogs and cats and discriminate them from other objects, such as bagels
(e.g., McClelland & Rumelhart, 1985). Here is how it works. Each of the
nodes is activated when a particular item is present in the environment.
In our example, each node may respond to a particular feature of a dog.
One node responds to ears, another node to a tongue, another node to a
tail, and so forth. These dog features are strongly connected, so they tend
to activate one another when they are present. The nest of associations will
also activate to a large extent when some, but not all, of the features are
present. In this way, the network will respond “dog” with even a partial
input. But it will not respond “cat” unless cat features are input. There is no
reference to long-term memory, short-term memory, or template matching.
The memory of “dog” is represented in the multiple connections.
Networks like the one shown in Figure 1.15 have a number of advan-
tages. For one thing, it is easy to see how they learn. When the network is
first exposed to different dogs, it associates the various features that occur
together on each trial. Each of the corresponding connections is increased
a little when features occur together and is decreased when they do not. (In
fact, connection strengths can be either positive or negative, with the result
that negative connections “inhibit” the connected nodes and vice versa.)
After the connections have been learned, the net can respond to new dogs
and discriminate them from cats, even if the new dogs or cats do not have
all the crucial features or even if some of the nodes or the connections were
to disappear or die. The connectionist approach has so many advantages
that some people assumed that it was a new “paradigm” that would replace
the information processing approach in cognitive psychology. In truth,
it did not. But the connectionist approach is now widely represented in
cognitive science. Clearly, it is consistent with many of the earliest ideas
(reflex, associationism, parsimony) that led to Learning Theory.
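A tiny runnable sketch can make the learning rule just described concrete. Everything below is a deliberately simplified, hypothetical network (the feature names come from Figure 1.15, but the update rule, learning rate, and threshold are illustrative assumptions, not the McClelland and Rumelhart model itself): connections between features that occur together are strengthened, and a partial set of dog features can then reactivate the rest of the “dog” pattern.

```python
from itertools import combinations

FEATURES = ["bark", "ears", "floppy tongue", "tail",
            "meow", "arched back", "white fur", "black fur"]

# Connection strengths between every pair of feature nodes, starting at zero.
weights = {frozenset(pair): 0.0 for pair in combinations(FEATURES, 2)}

def learn(present_features, rate=0.2):
    """Hebbian-style toy rule: strengthen connections between features that occur
    together on a trial; weaken connections between present and absent features."""
    for pair in weights:
        a, b = tuple(pair)
        both = a in present_features and b in present_features
        only_one = (a in present_features) != (b in present_features)
        if both:
            weights[pair] += rate
        elif only_one:
            weights[pair] -= rate / 2

def activate(input_features, threshold=0.1):
    """Spread activation from the input features through the learned connections."""
    activation = {f: (1.0 if f in input_features else 0.0) for f in FEATURES}
    for f in FEATURES:
        if f not in input_features:
            activation[f] = sum(weights[frozenset((f, g))] for g in input_features)
    return {f for f, a in activation.items() if a > threshold}

# Training: several "dog" trials and several "cat" trials.
for _ in range(5):
    learn({"bark", "ears", "floppy tongue", "tail"})
    learn({"meow", "ears", "arched back", "tail", "white fur"})

# A partial dog input (no bark heard) still activates the rest of the dog pattern.
print(activate({"ears", "floppy tongue"}))
```

Notice that there is no separate memory store in the sketch; as in the text, the “memory” of dog is nothing more than the pattern of connection strengths, and the pattern survives a missing feature in the input.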
Human learning and animal learning
People sometimes get the impression that the computer metaphor (the
information processing approach) and, to some extent, the brain metaphor
(the connectionist approach) are mostly relevant to the study of human,
but not animal, learning. They sometimes guess that the information pro-
cessing approach replaced the kind of Learning Theory that was prac-
ticed by Pavlov, Thorndike, Skinner, Tolman, and Hull. They are incorrect.
The truth is that the computer and brain metaphors are also important in
Learning Theory (which has advanced considerably since the days of Skin-
ner, Tolman, and Hull) and that what we know about human and animal
learning has shown that they are really rather similar. Thanks to some
important discoveries made in animal learning in the late 1960s (described
in Chapters 3 and 4), animal learning researchers began to recognize that
the information processing approach could help us understand classical
conditioning (e.g., Wagner, 1976, 1978). It became routine to discuss atten-
tion, rehearsal, memory storage, and memory retrieval in explaining even
the simple kind of learning shown by Pavlov’s dogs—provided the terms
were anchored carefully. Conditioned stimuli are now regarded as cues that
retrieve memories of unconditioned stimuli; perhaps classical conditioning
laws govern retrieval cues (e.g., Bouton, 1994a). Starting in the 1970s (e.g.,
Hulse, Fowler, & Honig, 1978), researchers also began to study tradition-
ally “cognitive” topics in animals (such as short-term memory, memory
codes, spatial memory, and timing) using sophisticated versions of the basic
methods used in operant conditioning experiments (see Chapter 8). At the
same time, animal learning researchers never lost the idea that complex
behaviors (and now complex cognitions; see Figure 1.15) can be built out
of simple associations or connections, the end product of conditioning. The
conceptual separation between human and animal learning and memory
may be more apparent than real. As the influential cognitive psychologist
John R. Anderson (1995) has noted, “Much current research on animal
learning has a strong cognitive [information processing] orientation, and
there has been a resurgence of more behavioristic learning theories in re-
search on human memory” (p. 4).
What, then, are the differences between the research literatures on
human and animal learning? First, animal learning has kept its original
interest in behavior. We may study topics like short-term memory, but the
ultimate goal is usually to understand how it makes sense of behavior;
this book adopts the view that our knowledge of behavior and learning
would be incomplete without them. Second, animal learning researchers
are often still rather stingy about the number of theoretical constructs that
they use; following Morgan, the goal is still a parsimonious explanation.
Third, research in animal learning typically asks questions about how we
process, cope with, and remember things linked to motivationally signifi-
cant events (food, drugs, painful events, illness, etc.). In contrast, human
learning research is often concerned with memory for material that is less
emotionally charged and biologically significant, like word lists, sentences,
and pictures. Perhaps because of this focus, topics covered in animal learn-
ing often seem more obviously connected to adaptation and evolution;
animal learning is fundamentally viewed as a way for organisms to adapt
to their environments (see Chapter 2).
The most obvious difference between human and animal learning and
memory, however, is the species that are studied. Earlier parts of this chap-
ter reviewed how the interest in learning in animals originally came about,
but is there still a point to studying learning in animals? The answer is yes.
Learning can be studied in animals because most psychologists believe
that there are fundamental, general processes that are represented in many
species. (We will address this issue in many places in the book, especially
in Chapter 6.) This approach also offers simplicity. For example, an under-
standing of learning in a rat or pigeon is not complicated by the subject
trying to figure out what the experimenter is trying to test and behaving
accordingly (the so-called demand characteristics of human experiments).
In addition, the prior knowledge and experience of animal subjects, as well
as their genetic backgrounds, can be controlled precisely. As Tolman (1945)
once quipped, “Let it be noted that rats live in cages: they do not go on
binges the night before one has planned an experiment” (p. 166). Animal
experiments offer simplicity and rigorous experimental control.
In animals, one can also study the effects of certain events or proce-
dures that are difficult to study in humans. For instance, we can study
how unpleasant emotions, like fear, are learned or how strong hunger
and other forms of motivation affect behavior and learning. One can also
study the physiological bases of learning and memory in experiments on
animals, and there is excellent work in behavioral neuroscience that inves-
tigates the brain mechanisms behind many of the fundamental processes
of animal learning (e.g., Balleine, 2005; Balleine & O’Doherty, 2010; Fan-
selow & Poulos, 2005; Freeman & Steinmetz, 2011; LeDoux, 2015; Schultz,
2006; Thompson & Steinmetz, 2009). Modern studies of neuroimaging
in humans can give us information about how brain activity correlates
with behavior. But the causal connections between brain processes and
behavioral output can be difficult to study in humans. The behavioral
study of learning in animals provides an important bridge between the
neurosciences and psychology.
That is not to say that anything goes. Research with animals (like re-
search with humans) is regulated by law and by rigorous ethical principles.
Since the mid-1980s, animal research projects conducted at every U.S. col-
lege, university, and research institution have been reviewed by committees
made up of scientists along with people from the lay community. The stress
connected with certain procedures—such as fear and hunger—is kept to
the minimum level that is necessary to study the problem. For example,
electric shock applied to the floor of a cage is often used to study fear in
rats. The shocks used are as weak as possible; they can cause a mild “fear”
in the rat, but they typically cause only a mild tingling to the human hand.
In a similar way, the hunger used in animal learning research may be no
more intense than what animals may experience in the wild (e.g., Poling,
Nickel, & Alling, 1990). There have been unmistakable gains due to ani-
mal research (e.g., Domjan & Purdy, 1995; Miller, 1985), and many will be
featured in this book.
To summarize, the study of learning in animals is a method that allows
scientists to study processes involved in learning and memory across spe-
cies. Human learning and animal learning are complementary approaches
to the same problem. The effect of the historical ideas traced in earlier
parts of this chapter—the reflex, associationism, and the evolution of the
mind—are clearly visible in modern animal learning theory and in cogni-
tive psychology.

Tools for Analyzing Learning and Behavior


Learning theory provides a way to think about almost any example of be-
havior. It provides a kind of lens through which behavior can be viewed.
The lens is the modern legacy of all the early theorists we have reviewed
in this chapter. It starts, first, by recognizing the role of two fundamental
forms of learning in nearly any situation, and second, by recognizing that
these forms of learning interact. Later chapters in this book will break
the framework down into more detail and study its component parts.
To appreciate what is coming, it is worth noting how the basic tools will
work together.
Learning about stimuli and about behavior
Soon after Watson’s time, researchers began to settle on two basic “para-
digms” of learning. One of them—the one investigated by Pavlov—is now
known as classical conditioning. In modern times, there are two major
reasons that classical conditioning is considered important. First, it is fun-
damental to adaptation because it is the way that animals learn to antici-
pate and deal with upcoming biologically significant events. The sound
of Pavlov’s bell did not just elicit drooling; it elicited a whole system of
responses and behaviors that are organized to get the dog’s system ready
for food. (This idea is considered further in Chapters 2 and 5.) Second,
notice that after conditioning has occurred, the dog behaves as if it has
learned to associate the sound of the bell with food. Classical condition-
ing represents a situation in which the animal learns to associate stimuli
in its environment. Classical conditioning is therefore used as a method
for investigating how animals learn about stimuli. For this reason, I will
sometimes call it stimulus learning.
The other basic paradigm is the result of Thorndike’s and Skinner’s
research. It is now called instrumental conditioning or operant condi-
tioning. This kind of learning is also very important in behavior because,
as we have already seen, it allows animals to do things that lead to good
consequences. It also lets them learn to stop doing things that lead to
bad consequences. Notice that the rat pressing the lever in the Skinner
box to get food is behaving as if it has learned the connection between
the act and the outcome. In modern terms, operant conditioning gives
us a method for investigating how animals learn about the relations be-
tween behaviors and their consequences. I will therefore sometimes call
it response learning.
Classical and operant conditioning have usually been studied separate-
ly mainly for analytic reasons; it is easiest to investigate them by studying
them apart from one another. This practice has led psychologists to use dif-
ferent terms and vocabularies to describe them, however, and the different
terms help create an unnecessary and artificial boundary between them. For
now, it is important to understand that in the natural world, classical and
operant conditioning are always working together. To help emphasize this
point, I will use some terms that can describe either of them and, indeed,
any situation in which an organism is learning and behaving (Colwill &
Rescorla, 1986; see also Bolles, 1972a; Jenkins, 1977). The terms may seem
a little abstract at first, but their abstractness is exactly the property that
makes it possible to use them in any situation.
Here they are: In classical conditioning, the animal behaves as if it learns
to associate a stimulus (S) with a biologically significant outcome (O). In
Pavlov’s experiment, the sound of the bell was S, and the food was O. In
principle, O can be anything significant—like food, shock, illness, a drug,
or a copulation—depending on the situation or experiment. Similarly, S
might describe the sound of a bell, a light, the sight of a rat, a friend, or a
room, again depending on the situation or the experiment. I will often refer
to classical conditioning as S-O learning because the organism behaves as
if it learns to associate an S and an O (see Chapter 3).
In many ways, operant conditioning is a similar affair. In this case, the
animal behaves as if it learns to associate a behavior or response (R) with a
biologically significant outcome (O). In Skinner’s experiment, pressing the
lever was R, and the food pellet was O. In Thorndike’s puzzle box experi-
ments, moving the latch was R, and the tasty fish was O. I will often refer
to operant conditioning as R-O learning because the organism behaves as
if it has associated an R with an O.
At a simple level, classical and operant conditioning boil down to S-O
and R-O learning. Both kinds of learning can occur whenever an organism
encounters a biologically significant event or outcome (O).
One difference between classical and operant conditioning is what the
animal appears to learn: In classical conditioning, the animal mainly learns
about stimuli; in operant conditioning, it mainly learns about behavior (I
am simplifying here). Classical and operant conditioning also differ in the
procedures used to study them. In a classical conditioning experiment,
the experimenter controls when the crucial events—S and O—occur. The
subject responds, but the response has no effect on whether O is delivered
or not. In contrast, in the operant or instrumental situation, the subject’s
behavior is essential—“instrumental”—for producing O. A difference be-
tween the two learning paradigms, then, is whether the subject’s behavior
controls the delivery of O.
Again, the distinction is largely academic. Outside the laboratory, stimu-
lus learning and response learning are almost inseparable. Every time an
O is associated with behavior, it can also be associated with stimuli that
are also present and vice versa. Behavior is a combined result of stimulus
learning and response learning. Let us consider a few examples.
Crows foraging at the beach
Reto Zach (1978, 1979) watched crows feeding on whelks (a type of marine
snail) (Figure 1.16) on a beach in British Columbia. The birds behaved in a
way that was remarkable and efficient. At low tide, they poked around near
the water among the whelks that tended to collect there. After lifting and
then rejecting one or two whelks, apparently on the basis of their weight,
each bird finally selected a large one and flew off with it to a favorite rocky
location near cliffs at the back of the beach. Once there, the crow flew up
to a height of a little more than 5 meters and dropped the whelk on the
rocks below. After a few attempts, the whelk shell broke, and the bird ate
the tasty animal that was living inside.
There are several aspects of the crow’s foraging behavior that are in-
teresting. The birds tended to confine their foraging to low tide, when the
whelks were most plentiful. They also took only the largest whelks, which
Zach showed were the ones whose shells seem to break most easily when
dropped. By selecting the largest whelks, the birds did not waste energy
trying to break whelk shells that would not break easily. The birds also had
different favorite dropping locations. And before dropping their whelks, each
bird flew to an optimal height that both minimized effort and maximized the
probability that the whelk shell would break upon impact. Thus, the birds
found and broke whelk shells in the most efficient way possible.

Figure 1.16  (A) Multiple whelks, or marine snails. (B) Typical behaviors of crows
finding and dropping whelks; the drawing marks the water, the beach, the cliff,
a dropping site, and a perch. (After Zach, 1978, 1979.)
The behavior almost seems miraculous. It appears quite intelligent and
complex. Learning Theory gives us a fruitful way to think about how the
birds do it. The crow fundamentally learns about various stimuli at the
beach, and about its behavior, through stimulus and response learning.
The first thing to notice is that there are several behaviors involved: The
bird approaches the water, lifts and rejects some whelks, and after accept-
ing one, flies to a favorite site and drops the whelk onto the rocks from
a specific height. Each of these behaviors is an operant that is reinforced
by the payoff of gaining access to the animal inside. At the same time,
the bird has also probably learned about stimuli in its environment. It
may have learned that low tide signals whelks on the beach. It has also
learned that whelk shells hide tasty animals; the whelk shell, along with
the beach itself, are signals for food. Whelks of a large size and weight are
the most likely to lead to reinforcement, yet another example of stimulus
learning. And as the crow flies off to its favorite dropping site, it uses
cues to get there. The dropping site itself is associated with a reward, and
that is why the bird may repeatedly visit the same site. We can begin to
understand the entire sequence by first breaking it into components of
stimulus learning and response learning.
Human eating and overeating
It is not inappropriate to note that human appetite and eating are also
commonly discussed in similar terms. Whenever we eat a tasty food or
a tasty meal, we have an opportunity to associate the food (O) with both
the stimuli that accompany it (Ss) as well as the actions (Rs) that produce
it. Stimulus learning and response learning are always operating when
we eat or look for food. Through S-O learning, we learn to associate food
with stimuli like restaurant logos (think “Golden Arches”), food odors
(think recently baked bread or sticky buns), and food packaging (think of
the brightly colored wrapper that contains your favorite candy bar). This
sort of learning has predictable and interesting effects on our appetites
and eating behavior; loosely speaking, exposure to stimuli associated with
food excites and stimulates our appetite (see Chapters 2 and 9 for more
discussion). Approaching a restaurant, purchasing food, or removing a
burger from its packaging are all behaviors (Rs) that have been associated
with food (O); we perform them thanks to R-O learning. Understanding
stimulus and response learning can contribute greatly to our understanding
of eating and appetite (e.g., Kessler, 2009). We will therefore have much
more to say about these ideas in later chapters.
Kids at play
The general framework is useful for understanding behavior in virtually
any setting. For example, when small children are outside playing, all their
activities (playing with toys, drawing or painting, interacting with other
children) are in principle governed by stimulus learning and response learn-
ing. The child’s choice of which activity to engage in is influenced by each
behavior’s reinforcing value or payoff (see Chapter 7). How small children
actually learn to do the things they do is not trivial. A child who has learned
to blow bubbles could have first associated the bubble container with fun
and excitement (stimulus learning), which alone might generate interest
and pleasure if a bubble bottle were to appear at a birthday party. And, of
course, further learning to dip the wand in the soap, to bring the wand to
the lips, and to actually blow are all examples of operant behaviors that are
learned and perfected through response learning. The pleasure provided by
this (or any other) activity will be further associated with the situation (e.g.,
a friend’s house or the arrival of Uncle Bob)—more stimulus learning that
will further influence behavior. Response learning and stimulus learning are
always happening in combination and influencing behavior everywhere.
Learning about stimuli and responses starts very early in life (e.g., Lip-
sitt, 1990). Even very young babies quickly learn to suck on pacifiers that
deliver a sweet solution. They also learn that sucking in the presence of
certain pictures leads to the sweet solution, whereas other pictures do not
(e.g., Lipsitt, 1990). They suck mostly when the right pictures are shown.
Sucking is an operant behavior, controlled by its consequences. The pic-
tures also signal when the relation is in force. But the pacifier itself and the
pictures are also directly associated with the sweet solution. Presumably,
through stimulus learning, they generate excitement—and possibly also
some of the sucking.

Figure 1.17  Children learning in Carolyn Rovee-Collier’s experiments. (A) By
kicking its foot, the very young baby moves the mobile to make it jiggle. (B) By
pressing the lever, the somewhat older baby (6 months and older) makes the
train move along its track. (Photographs courtesy of C. Rovee-Collier.)
The late Carolyn Rovee-Collier studied how babies 2 to 6 months of
age learn and remember (e.g., Rovee-Collier, 1987, 1999). She and her stu-
dents (who continue to do this work) visit babies’ homes and conduct
experiments while the baby is in its crib. They mount a bouncing mobile
on the crib directly above the baby. While the baby is on its back, the
experimenter loops one end of a ribbon around the baby’s ankle and the
other end around the mobile (Figure 1.17A). When the baby kicks, the
mobile jiggles—and the babies are delighted with the movement. Babies
quickly learn to kick their feet to shake the mobile, and they remember it
when they are tested later. This scenario, like so many others, involves
both response learning and stimulus learning. The kicking response is an
operant associated with movement of the mobile; but the mobile itself—as
well as the bumper and other stimuli around the walls of the crib—is also
associated with the movement and the activity. Babies forget over time, of
course, but they forget more slowly as they get older. Older babies (6–18
months old) can be tested with a different arrangement in which press-
ing a lever causes a miniature toy train to move around a track (Figure
1.17B). There is little change in how learning and remembering work in
these situations, and there is no obvious change in performance when the
baby learns to talk. Rovee-Collier argued that “infants’ memory process-
ing does not fundamentally differ from that of older children and adults”
(Rovee-Collier, 1999, p. 80).
People using drugs
Stimulus learning and response learning are also extremely useful in un-
derstanding drug-taking in humans (e.g., Stolerman, 1992). There is no
question that drugs are biologically significant events (Os) and that they
can be powerful reinforcers. Drugs may reinforce drug-taking behavior
because they cause euphoria or perhaps because they reduce anxiety. In
either case, people can thus learn to acquire drugs and take them through
simple response learning. (As we saw at the start of the chapter, drug-
taking behavior can actually be studied in rats, which will learn to press
levers in a Skinner box if doing so leads to a drug injection.) Notice, though,
that every time a drug is taken, the person can also associate it with the
stimuli that are present at the same time; that is, the drug also provides
an O in a classical conditioning trial. Drug users associate both behaviors
and stimuli with drugs. Both types of learning have major implications
for how we think about drug abuse (e.g., Bevins & Bardo, 2004; Everitt &
Robbins, 2005; Milton & Everitt, 2012; Saunders & Robinson, 2013; Siegel,
1989; Stewart, de Wit, & Eikelboom, 1984; Stolerman, 1992). For example,
one result of being presented with a cue associated with a drug is that it
may elicit a withdrawal-like effect, or a general expectancy of the effects
of getting high, that further motivates the operant behavior that produces
it. As we will see in Chapter 9, one important effect of stimulus learning is
that it is supposed to motivate operant behavior.
I will return to the question of drug-taking in several subsequent chap-
ters. Given that drugs can be seen as reinforcers, it would be useful to know
more about how reinforcers work. You may want to know how they main-
tain behavior, how different ways of scheduling reinforcers might influence
behavior, and under what conditions the pairing of a behavior and an
outcome produces learning and under what conditions it does not. Simi-
larly, it would be useful to know how S-O learning works, what conditions
lead to it, and what conditions discourage it. You may want to learn more
about what kinds of things are actually learned in S-O learning: Do we
respond to drug signals with excitement or other physiological responses?
You may also want to know how knowledge about stimuli (S-O learning)
and knowledge about behavior (R-O learning) influence each other. And,
of course, if these kinds of learning are involved in drug abuse, you may
want to know how to get rid of them. Learning theory gives us a handle
on all these questions. All that (and more) is what this book is all about.
Relations between S, R, and O
Figure 1.18A puts all the above examples together into a single description.
It illustrates the idea that any situation involving biologically significant
events or outcomes (O) involves an opportunity to associate O with both
behavior (R) and with stimuli (S) present in the environment. We study
S-O learning, as well as its effects on behavior, when we study classical
conditioning. We study R-O learning, and its own effects, when we study
operant conditioning. By now, it should be easy to relate almost any ex-
ample of behavior to this general framework. The S, R, and O scheme can
apply to any situation as easily as it applies to foraging crows, people at
the food court, bubble-blowing children, and drug abusers.

Figure 1.18  (A) S, R, and O describe any example of behavior, and therefore
help us to understand them. (B) Foraging crows learn to forage (R) in the
presence of low tide (S) to get a tasty whelk snack (O). (C) A person approaches
a restaurant and handles food packaging (R) in the presence of a large number
of food cues (S) in order to eat a tasty burger (O). (D) A baby learns to kick (R)
in the presence of a particular crib bumper (S) in order to jiggle the mobile (O).
(E) A person takes a drug (R) in the presence of many stimuli, like a room, a
friend, or an odor (S), in order to receive the drug’s effect (O).
Figure 1.18A also illustrates two other links besides R-O and S-O. In
one, we learn that stimuli actually signal an association between a behav-
ior and a significant event. That is what S’s connection with the R-O as-
sociation portrays. For the foraging crow (Figure 1.18B), low tide signals
that “foraging now pays” in addition to directly signaling the presence of
food. For the baby in the crib (Figure 1.18D), the crib bumper signals that
kicking will now jiggle the mobile. For the drug user (Figure 1.18E), a
street corner might signal that “drug buying will bring drugs” in addition
to being directly associated with drugs. This kind of learning also has an
important influence on behavior. Here, the stimulus “sets the occasion”
for the operant behavior (Skinner, 1938).
In the final link, we may learn a direct association between stimuli and
response (S-R) so that the stimulus can elicit the response reflexively. Many
psychologists think that this kind of association becomes important after
a great deal of repetition and practice; as a behavior becomes “habitual,”
the situation may come to elicit it automatically. The behavior becomes
“automatized” (e.g., Schneider & Shiffrin, 1977). Because of the early influ-
ence of the reflex concept, psychologists for the first half of the 20th century
often assumed that S-R connections were the major basis of all learning.
Today, however, psychologists are open to a possible role for every one of
the simple connections presented in Figure 1.18 (e.g., Rescorla, 1991; see
also Chapter 10).
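Because the four links just described (S-O, R-O, S signaling the R-O relation, and S-R) come up again and again in later chapters, it may help to see them laid out side by side. The sketch below is purely a bookkeeping device, not a formal model from the text; the field names are illustrative, and the entries simply restate the drug-taking example of Figure 1.18E.

```python
from dataclasses import dataclass

@dataclass
class LearningEpisode:
    """One situation described with S, R, and O, plus a slot for each of the
    four kinds of association discussed in the text (illustrative labels)."""
    S: str                   # stimulus present in the situation
    R: str                   # behavior (response)
    O: str                   # biologically significant outcome
    s_o: str                 # S-O: classical (stimulus) learning
    r_o: str                 # R-O: operant (response) learning
    s_sets_occasion: str     # S signals that the R-O relation is in force
    s_r: str                 # S-R: with repetition, S may elicit R directly

drug_example = LearningEpisode(
    S="street corner, friend, drug odor",
    R="drug taking",
    O="drug effect",
    s_o="the corner and odor become associated with the drug effect",
    r_o="drug taking is associated with (and reinforced by) the drug effect",
    s_sets_occasion="the corner signals that drug taking will now produce the drug",
    s_r="after much repetition, the corner may trigger drug taking automatically",
)
print(drug_example)
```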
The scheme shown in Figure 1.18 can be applied to almost any example
of behavior you may encounter. Try thinking about how it helps you under-
stand your own behavior and the behavior of your friends. The framework
can be surprisingly useful, and it is worth the exercise. This book is orga-
nized so that the different types of learning represented here are discussed
in a sequence. We will begin by talking about the similarities between S-O
and R-O learning. Then we will ask how S-O learning functions and how
it works. There will be some surprises along the way because Pavlovian
conditioning is not what most people think it is (Rescorla, 1988b). We will
eventually turn to R-O learning, and the relationships between S-O and
R-O learning, as we consider some interesting topics in animal cognition
and motivation. Ultimately, we will put the pieces back together and think
about how the whole thing operates. The journey is not short, but it is per-
haps the most important part of a story that started with Descartes, Locke,
Darwin, and the early comparative psychologists.

Go to the Companion Website at sites.sinauer.com/bouton2e for review
resources and online quizzes.

Summary
1. In the 1600s, philosophers began to wonder whether humans are ma-
chines that operate according to scientific laws. Descartes’s distinction
between mind and body held that only the body is controlled by such
laws. Specifically, the human body (and all animal behavior) is controlled
by reflex action.
2. Later philosophers suggested that even the mind is governed by scientific
laws. The British Empiricists (e.g., Locke and Hume) argued that the mind
is a tabula rasa at first, with knowledge being written on it by experience.
Complex ideas are built up from simple associations, following several
laws of association. Rationalists (e.g., Kant) differed from the Empiricists
in supposing that the mind is not initially empty, but starts with certain a
priori assumptions with which it actively molds experience.
3. Meanwhile, in the 1800s, biologists were beginning to learn more about
the physiology of reflexes and reflex action. According to thinkers like
Sechenov, even human thoughts could be understood as reflexes of the
brain. By the turn of the 20th century, all this set the stage for Pavlov’s
pioneering work on learned “psychic” reflexes. Processes of major sig-
nificance could now be studied with conditioning experiments.
4. In the mid-1800s, Darwin’s theory of evolution emphasized that humans
and animals are alike. The early comparative psychologists began to
study one of its most astonishing implications: that even the human
mind has evolved. To do so, they studied the behavior of animals
in an attempt to identify the cognitive processes that they possess.
Ultimately, parsimonious principles won out. Thorndike’s experiments
on cat intelligence led to the conclusion that learning could generally
be understood by knowing how reinforcers stamp in S-R associations.
Thorndike’s work also encouraged interest in conditioning experiments.
5. Watson rescued psychology from the morass of introspection by pro-
posing behavior as its subject matter. The main advantage of studying
behavior is that everyone can see it; the facts therefore do not merely
depend on what the most powerful people believe or introspect. Wat-
son was also empiricistic and saw a central role for learning. Like others
before and after him, he also saw the reflex as an abstract thing so that
learned reflexes in animal conditioning experiments were directly rel-
evant to the reflexes he saw in humans in the real world.
6. At least two forms of behaviorism emerged after Watson. Skinner’s
radical behaviorism set out to study the empirical relationships between
observable events, such as stimuli and responses. This approach identi-
fied two types of behavior: respondents, which are behaviors elicited by
events that precede them, and operants, which are behaviors that are
controlled by their consequences.
7. In contrast, Tolman’s operational behaviorism uses unobservable theo-
retical constructs (or “intervening variables”) to help explain behavior.
These constructs are useful provided they are carefully anchored to
things that can be manipulated and measured objectively. The main
idea of operational behaviorism—that unobservable constructs are use-
ful and scientifically valid if they are systematically linked to behavioral
output—is accepted today by most parts of scientific psychology.
8. After World War II, psychologists began using the computer as a
metaphor for human nature, which led to the information process-
ing approach. In the 1980s, the connectionist approach began to use
networks of neurons in the brain as its inspiration and metaphor. Both
approaches are accepted and used today by modern students of learn-
ing in animals.
9. Modern Learning Theory accepts an overarching framework that can be
used to analyze any example of human or animal behavior. Behaviors
(R) typically occur in the presence of stimuli (S) and precede significant
events or outcomes (O), like reinforcers. Several possible relations can
be learned between S, R, and O, and each can occur and play a power-
ful role. S-O learning is studied with the methods of classical condition-
ing, which, as we will see in later chapters, has many surprising features
and consequences. R-O learning is studied with the methods of operant
conditioning and is also fundamentally important. S may also signal the
R-O relation, may be connected directly with R, or may motivate be-
havior based on the R-O relation. In this book, we will consider what we
know about each of these kinds of learning and their interrelationships.

Discussion Questions
1. Contrast the way an associationist like Locke and a rationalist like Kant
viewed the mind. How are these points of view represented in what you
know about Learning Theory and/or other fields of psychology today?
2. Why did the early comparative psychologists and biologists who stud-
ied the reflex all come to focus on the conditioning experiment at the
turn of the 20th century? What is the role of the conditioning experi-
ment in psychology today?
3. What are the main ideas of “behaviorism?” Considering all the great
thinkers beginning with Descartes, who before Watson do you think
could be considered the earliest “behaviorist?”
4. Contrast the information processing approach with the connectionist
approach to cognitive psychology.
5. Describe two examples of S-O learning (classical conditioning) and two
examples of R-O learning (operant conditioning) that you have ob-
served in your own behavior, or that of your family or friends, over the
last few days.
Key People and Key Terms


antecedent  19; association  7; atomistic  7; British Empiricists  7;
classical conditioning  28; conditioned reflex  14; connectionism  24;
consequence  19; early comparative psychologists  10; Charles Darwin  9;
Julien de la Mettrie  6; René Descartes  5; elicited  19; hedonism  6;
Thomas Hobbes  6; Clark L. Hull  21; David Hume  7; information processing  23;
instrumental conditioning  28; intervening variable  20; Immanuel Kant  8;
John Locke  7; law of effect  13; Learning Theory  4; Morgan’s Canon  11;
neural networks  24; operant  18; operant conditioning  28;
operant experiment  18; operant-respondent distinction  19;
operational behaviorism  20; parallel distributed processing  24;
Ivan Pavlov  13; R-O learning  29; radical behaviorism  17; rationalism  8;
reflex action  5; reinforcer  18; respondent  19; response learning  28;
S-O learning  29; B. F. Skinner  17; Skinner box  18; stimulus learning  28;
structuralism  14; tabula rasa  7; theoretical construct  20;
Edward L. Thorndike  12; Edward C. Tolman  20
Chapter Outline
Evolution and Behavior  42
  Natural selection  42
  Adaptation in behavior  42
  Fixed action patterns  44
  Innate behavior  45
  Habituation  47
Adaptation and Learning: Instrumental Conditioning  50
  The law of effect  51
  Reinforcement  52
  Shaping  52
Adaptation and Learning: Classical Conditioning  54
  Signals for food  55
  Territoriality and reproduction  56
  Fear  59
  Conditioning with drugs as the outcome  60
  Sign tracking  63
Other Parallels Between Signal and Response Learning  64
  Extinction  64
  Timing of the outcome  66
  Size of the outcome  69
  Preparedness  70
Summary  74
Discussion Questions  75
Key Terms  76
chapter 2

Learning and Adaptation

In Chapter 1, we saw how researchers eventually came to
emphasize two basic forms of animal learning: clas-
sical conditioning, in which animals learn to associate
stimuli with outcomes, and operant (or instrumental)
conditioning, in which they learn to associate behav-
iors with outcomes. In later chapters, we will look at
how these kinds of learning actually operate. Before
we do, however, it is worth asking why they are so
prevalent and important. The main reason is that both
kinds of learning give humans and other animals a way
to adapt to their changing environments. This rather
simple idea has some surprisingly interesting implica-
tions. In the first sections of this chapter, some of the
adaptive functions of behavior and learning will be
considered. In the last part of the chapter, I will show
that—largely because of their overlap in function—
both classical conditioning and operant conditioning
follow similar rules. Throughout, I hope you will begin
to appreciate the surprisingly wide range of behavior
that is influenced by classical conditioning and oper-
ant conditioning and why they are worth knowing and
thinking about.
We will start with the idea that learning is mainly a
way in which animals adapt to their environments. To
be more specific, it is a way that they can adapt through
experience. It is interesting to observe that this kind of
adaptation nicely complements another, more famous,
adaptation process: Evolution by natural selection.
Evolution and Behavior


Natural selection
Consider the story of how a population of moths (pepper moths) changed
color from black-and-white speckled to more purely black as England be-
came industrialized during the 1800s (see Kettlewell, 1956, for a review).
During this period, air pollution began to darken the trees, turning them
from speckled white to black. As a result, predatory birds found it easier to
spot light moths that were resting on the trees, and light moths were eaten
more readily than dark moths. The more conspicuous, lighter moths were
gradually eliminated (see also Howlett & Majerus, 1987, as well as studies
with blue jays by Bond & Kamil, 1998, 2002, 2006—discussed in Chapter
8—for further evidence of such a process). The dark moths survived and
passed their dark genes on to offspring, and the dark offspring in turn sur-
vived. Over generations, the population of moths became darker; in fact,
the moths appeared to have adjusted to the new, darker, environment. No
force or higher intelligence was necessary to make these insects turn from
speckled white to black. Instead, lighter moths were simply at a disadvan-
tage, and lighter pigmentation was eliminated from the population. This,
of course, is an example of evolution through natural selection, a process
that is also illustrated and supported by studies of finches on the Galápagos
Islands (e.g., Grant & Grant, 2002). For example, after a major drought, the
supply of edible seeds decreased dramatically, leaving only tough, difficult-
to-crack seeds. Only birds with larger beaks could eat them, and there was
an increase in the average beak size of birds in the next few generations.
In evolution, the bottom line is reproductive success. If the dark moths
(or finches with sturdy beaks) had survived to live longer lives but failed
to produce offspring and create a new generation, the process would have
gone nowhere. In the long run, the winners in evolution are the individuals
who pass their genes along to the next generation. These individuals are
“fit.” Fitness describes an animal’s ability to produce offspring that will
reproduce in the next generation. Evolutionary thinking has increasingly
focused on the survival of the organism’s genes (e.g., Hamilton, 1964; Wil-
liams, 1966). To paraphrase E. O. Wilson (1975), the organism is essentially
the gene’s way of making more genes. For an entertaining and insightful
discussion of these issues, I recommend Richard Dawkins’s book, The Selfish
Gene (Dawkins, 1989). For now, it is important to remember that natural
selection works because individuals with adaptive traits are more likely
to reproduce than those without such traits and that their genes will be
represented in future generations.

Adaptation in behavior
For the moths in industrialized England, being dark had clear survival
value. There was some initial genetic variation in this trait, and the blacker
genes then survived. Blackness was represented in the next generation
because blackness was inherited. If behaviors are likewise linked to genes,
they could likewise evolve through natural selection. That is the main idea
of ethology, the study of the adaptiveness and evolution of behavior. Like
coloration or body parts, behaviors themselves may evolve.
Because of their interest in the evolution of behavior, ethologists study
behavior as it occurs in nature. Only then can its adaptation to the environ-
ment be understood. The actual evolution of a behavior is not possible to
observe because behavior (unlike bones) does not leave fossils. Ethologists
have nonetheless worked out a number of interesting methods for making
inferences about how behavior has evolved (e.g., Alcock, 2013; Barash,
1982; McFarland, 1993). Sometimes a plausible evolutionary path can be
put together by comparing the behavior of related species. For example,
Kessel (1955) studied the evolution of an interesting aspect of courtship
behavior in empidid flies. In some species, the male gives an empty silk bal-
loon to the female during courtship. Where did this behavior come from?
In related species, the male might give a juicy insect to the female, which
she eats during mating. In other species, the prey item is adorned in silk;
in still others, the prey item is wrapped completely. Kessel suggested that
these simpler and more common behaviors evolved into the unusual be-
havior of the male empidid fly presenting an empty balloon to the female.
More typically, ethologists work out the evolution of a behavior by
evaluating its possible benefit or survival value. In the case of the simplest
empidid fly, the male may benefit from bringing a prey item to his mate
because it gives her something to eat during courtship besides him! The
“evaluative approach” (Barash, 1982) is most powerful when real evidence
is obtained that confirms the hypothetical benefit to the animal. This point
was made beautifully by Niko Tinbergen, one of the most important of
the early ethologists. While studying black-headed gulls in the 1950s, he
noticed that parent birds were quick to remove broken eggshells from the
nest once their chicks had hatched. He hypothesized that this behavior was
adaptive because the egg shells were white on the inside, and broken ones
lying around the nest would be easy for predators to detect. By removing
the broken eggshells, the parents were protecting their young. To test the
idea, Tinbergen (1963) conducted experiments in which eggs were strewn
around a field, with broken egg shells lying near some of the eggs. The
eggs were subject to predation from crows and other gulls. Consistent with
his hypothesis, the eggs that were near broken egg shells were eaten more
often than eggs that were not (Figure 2.1). The experiments established a
clear payoff for eggshell removal by gull parents.
Tinbergen’s hypothesis was also supported by the comparison of black-
headed gulls with other species. Ester Cullen (1957) compared the behavior
of the black-headed gull with the behavior of the kittiwake, a closely related
species. Unlike the black-headed gull, which nests in large colonies on the
ground, the kittiwake nests on steep cliffs, where its eggs and chicks are
less bothered by natural predators. Interestingly, kittiwakes do not practice
eggshell removal. They also do not perform other antipredator behaviors
practiced by black-headed gulls, such as alarm calling when predators are
Distance between egg and broken shell     Eggs taken     Eggs not taken
15 cm                                          63               87
100 cm                                         48              102
200 cm                                         32              118

Figure 2.1  Eggshell removal in black-headed gulls. Predators are more likely to
spot eggs and eat them if they are near a broken eggshell than if they are not.
There is thus an evolutionary “payoff” for removing eggshells from the nest.
(After Tinbergen, 1963.)
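To make the payoff concrete, the counts can be converted to proportions (the
percentages below are computed here for illustration; they are not reported in
the original figure): eggs lying 15 cm from a broken shell were taken on about
63/(63 + 87) ≈ 42% of occasions, eggs 100 cm away on about 48/(48 + 102) ≈ 32%,
and eggs 200 cm away on only about 32/(32 + 118) ≈ 21%. Roughly speaking, a
nearby broken shell doubled the chance that an egg was found and eaten.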

in the area or predator “mobbing.” (In mobbing, a group of gulls gather and
screech loudly at the predator to drive it away.) By studying related species
in different environments, we can understand how behavior has adapted
to those environments. Unrelated species in similar environments can also
be compared (see Alcock, 2013). This sort of study provides interesting in-
sights into how behavior evolves and adapts an animal to its environment.

Fixed action patterns


When ethologists examine the behavior of animals in the natural world,
they often uncover what Konrad Lorenz called fixed action patterns.
Fixed action patterns are fixed behavior sequences that are triggered by
stimuli known as releasers or sign stimuli. They are supposed to have
several characteristics. Fixed action patterns are highly stereotyped (they
do not vary between individuals or between occasions when one individual
performs them). They also depend on initial triggering only. Most impor-
tant is that they are not supposed to depend on learning, which means that
they appear in their typical form when the animal first performs them.
Fixed action patterns are supposed to be built into the genes—innate—just
as morphological characteristics such as wings and eye color are.
One of the most impressive examples of a fixed action pattern is cocoon-
building behavior in the spider Cupiennius salei (Eibl-Eibesfeldt, 1970). The
female first spins the base of the cocoon and then its walls; after deposit-
ing the eggs inside, she closes the top. Remarkably, the sequence always
takes about 6,400 movements. That is true even if the cocoon is destroyed
halfway through the task: The spider keeps going where she left off. If
the glands producing the cocoon’s threads fail to function (as they did at
least once when hot lights used in filming the process dried them out),
the spider still performs the whole sequence. After initial triggering, the
sequence just keeps on going. A fixed action pattern does not depend on
feedback once it is set in motion.

Figure 2.2  The eyebrow flash in Bali (top row) and Papua New Guinea (middle
and bottom rows). The sequences shown occurred in well under 2 seconds.
(From Eibl-Eibesfeldt, 1970.)
Eibl-Eibesfeldt (1970, 1979, 1989) discussed a number of possible fixed
action patterns in humans. For example, humans from every culture that
have been studied (from Western Europe to Papua New Guinea to Samoa)
appear to smile. They also flash their eyebrows at the beginning of friendly
social greetings (Figure 2.2). The fact that eyebrow flashing and smiling
are so stereotyped and appear to occur over such a wide range of cultures
(with learning environments that presumably differ quite substantially)
suggests to Eibl-Eibesfeldt that they may be innate fixed action patterns.

Innate behavior
In truth, the “innateness” of fixed action patterns is more often assumed
than really proven. Even a learned behavior can be highly stereotyped if
crucial features of the environment are similar enough between individu-
als. In the strict sense, it is surprisingly difficult to prove that a behavior
is “innate.” Ethologists once thought that a behavior’s innateness could
be proven by running “deprivation experiments” in which they raised
the animal from birth in an environment that lacked experiences that
are assumed necessary for the behavior to be learned. If the behavior
still emerged during development, it must be innate. Unfortunately, de-
privation experiments can never be conclusive because an experimenter
can never deprive an animal of all experience; after all, it is never pos-
sible to know for sure that the animal has been deprived of all the right
experiences.
There are better methods to study innateness. In artificial selection
experiments, researchers study whether behaviors can be passed from
generation to generation by only allowing animals that show a specific
behavior to interbreed. Many behaviors have been selected this way (e.g.,
Cade, 1981; Hirsch, 1963). For psychologists, some of the best-known
work of this type was conducted by Tryon (1942), who ran rats in mazes
and separately mated those that learned quickly and those that did not.
Offspring of “bright” parents were bright, offspring of “dull” parents
were dull, and the difference between the lines increased over succeeding
generations. Although the results suggest that maze-brightness and maze-
dullness can be inherited, the difference between lines also appeared to
depend on the kind of environment in which the rats were raised. Cooper
and Zubek (1958) found that rats of the bright lines and dull lines differed
when they were raised in a “normal” laboratory cage, but when they were
raised in either enriched or impoverished cages, the difference between
bright rats and dull rats disappeared. The expression of genetic potential
in behavior may often interact with the type of experience offered by
particular environments.
For many reasons, then, the term innate is probably best used to refer
only to behaviors that have no obvious basis in learning (McFarland, 1993).
Interestingly, even behaviors that meet this definition can be modified to
some extent by experience over time. For example, we will soon see that
classical conditioning allows “innate” behavior to be evoked by cues that
merely signal sign stimuli. Conditioning thus allows innate behaviors to be
released by a wider range of stimuli. Innate behavior can also be modified
by instrumental conditioning. For example, Tinbergen and Perdeck (1950)
showed that herring gull chicks peck at their parents’ beaks to make the
parents regurgitate food, which the chicks then eat with some enthusiasm.
The chicks’ pecking appears immediately after hatching and is highly simi-
lar between birds. It is also released only by highly specific sign stimuli.
Hailman (1967) noticed that the similar pecking of young laughing gulls
became more accurate and efficient as the birds grew older. Part of this
change is due to maturation, but part of it is because some forms of the
response are more successful than others in producing food. In other words,
the behavior is affected by response learning.
Figure 2.3  Habituation happens when organisms are repeatedly exposed to a
stimulus that elicits a response. (A) Calling by chaffinches in response to an
owl. (B) Orienting in rats toward a light. (C) Startle reactions in rats to
sudden bursts of noise. (A, after Hinde, 1970; B, after Hall & Channell, 1985;
C, after Marlin & Miller, 1981.)

Habituation
Experience can modify “innate” behaviors in yet other ways. One very
common feature of behavior is that it often shows habituation. When a
sign stimulus is presented repeatedly, the strength of the response often
declines. For example, young birds are initially frightened when a shadow
flies over them, but the fear response decreases with repeated exposure to
the shadow (e.g., Schleidt, 1961). Likewise, mobbing behavior decreases
with repeated exposure to the eliciting stimulus (e.g., Hinde, 1970; Figure
2.3). A rat will orient to a light the first few times it is presented, but this
orienting response likewise declines with repeated presentation. Similarly,
many organisms are startled by presentation of a sudden burst of noise; if
the noise is presented repeatedly, the startle response also becomes habitu-
ated (e.g., Davis, 1970; Groves & Thompson, 1970; Marlin & Miller, 1981).
Habituation is extraordinarily common across species, stimuli, and be-
haviors. It presumably prevents the animal from wasting time and energy on
behaviors that are not necessarily functional. It has a number of important
characteristics (see Groves & Thompson, 1970; Rankin et al., 2009; Thomp-
son & Spencer, 1966) that are worth describing here. For example, habitua-
tion is not permanent; the habituated response tends to return or “recover”
spontaneously over time. This spontaneous recovery (after habituation)
effect is illustrated in Figure 2.3C, where you can see that the habituated
startle response returned, at least partly, after the end of Session 1 and the
start of Session 2, which began an hour later (Marlin & Miller, 1981).
Leonard Epstein and his colleagues have done some interesting research
on habituation of responding to food in humans (see Epstein, Temple, Ro-
emmich, & Bouton, 2009, for a review). Figure 2.4 illustrates some of this
work. It shows the amount of salivation elicited by a drop of either lemon
juice or lime juice that was put on women’s tongues on each of several trials
(Epstein, Rodefer, Wisniewski, & Caggiula, 1992). The first drops elicited
a lot of salivation, but over 10 trials, the amount of responding clearly de-
creased. Then the experimenters presented a drop of the other juice (lime to
lemon-habituated participants and lemon to lime-habituated participants).
Responding to the other juice is not shown. But on the final trial, a drop of
the original juice was put on the tongue again. Notice that salivation
returned. This recovery of the habituated response after exposure to a
different stimulus is known as dishabituation. Dishabituation demonstrates
that habituation itself is not merely fatigue. The response had decreased over
trials, but the subjects could clearly still respond to the juice.

Figure 2.4  Habituation of the salivary response to juice put on the tongue of
humans. Trial 12 occurred after an exposure to a different juice (not shown);
the increase in the response illustrates dishabituation. (After Epstein,
Rodefer, Wisniewski, & Caggiula, 1992.)

Habituation is also clearly stimulus-specific. For example, Figure 2.5 shows
the responses of 8- to 12-year-old children to food stimuli over repeated
exposures to them (Epstein et al., 2003). In one experiment, the children were
exposed to the odor of a fresh piece of cheeseburger on several trials and then
were exposed to a piece of apple pie (Figure 2.5A). As you can see, salivation
to the burger habituated, but when apple pie odor was presented, the kids
salivated once again. In another experiment (Figure 2.5B), children moved a
computer joystick to earn points that could be exchanged for a piece of
cheeseburger. (This activity is actually an instrumental learning task.) The
responses for cheeseburger were high at the start of the session, but gradually
decreased because of habituation to the food. When pie was then made available,
though, responding started up again.

Figure 2.5  Habituation is stimulus-specific. (A) Salivary response to cheese-
burger and then apple pie in children. (B) Motivated instrumental responding
for cheeseburger and then apple pie in children. (After Epstein et al., 2003.)

In both of these experiments, habituation to cheeseburger did not prevent
responding for the new food stimulus. The fact that habituation to food
shows stimulus-specificity might explain why many of us find it hard to
resist dessert at the end of a filling meal.
Eating is influenced by a large number of factors (e.g., see Woods, 2009;
see also Chapter 9), but the idea that habituation is involved has many
implications. As just noted, the stimulus-specificity of habituation may be
one reason why desserts are often so tempting. And because of habitua-
tion, a meal that contains a variety of foods will result in more food intake
than a meal that consists of only one food (Temple, Giacomelli, Roem-
mich, & Epstein, 2008). Interestingly, adults and children who are obese
are slower to habituate to food than their leaner peers (Epstein, Paluch,
& Coleman, 1996; Epstein et al., 2008; Temple, Giacomelli, Roemmich, &
Epstein, 2007). It is hard to know whether the slow habituation is a cause
or a result of the obesity. But the rate at which a child habituates to food
in the lab can actually predict how much weight he or she will gain in
the coming year (Epstein, Robinson, Roemmich, & Marusewski, 2011).
Research on habituation thus has many interesting implications for how
we react to and eat food.
Habituation is not the only possible result of repeated exposure to a
stimulus. With intense or moderately intense stimuli, sensitization can
also occur. Sensitization is an increase (instead of a decrease) in respond-
ing over a series of repeated exposures to the same stimulus. The effect
sometimes occurs during early trials in habituation experiments. For ex-
ample, sensitization can be seen in Figure 2.5B, where the children initially
increased their responding for cheeseburger over the first several trials,
as if the first exposures to cheeseburger got them excited about the food.
Ultimately, however, habituation kicked in and the amount of responding
decreased. According to one view, the dual-process theory of habituation
(Groves & Thompson, 1970), exposure to a stimulus actually engages both
an habituation process (which decreases responsiveness to the stimulus
over trials) and a sensitization process (which increases responsiveness
to stimuli over trials). The sensitization process creates a state of arousal
or excitability. The level of responding that one observes on any trial will
depend on the relative strengths of these two processes at the time. We will
have more to say about habituation and sensitization later in this book (see
Chapters 4 and 9).
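To see how the two processes can combine, here is a minimal numerical sketch of
the dual-process idea. It is my own illustration, written in Python; the update
rules and parameter values are simplifying assumptions and are not part of
Groves and Thompson's formal treatment.

```python
# A minimal sketch of the dual-process idea (illustrative only; the update
# rules and numbers are assumptions, not Groves and Thompson's model).
# Each stimulus presentation strengthens both an habituation process and a
# sensitization process; observed responding reflects their combination.

def dual_process(n_trials=12, baseline=1.0,
                 sens_max=2.0, sens_rate=0.4,
                 habit_max=2.5, habit_rate=0.15):
    sensitization = 0.0
    habituation = 0.0
    observed = []
    for _ in range(n_trials):
        # Each process grows toward its own asymptote with every exposure;
        # sensitization grows faster, but habituation ultimately grows larger.
        sensitization += sens_rate * (sens_max - sensitization)
        habituation += habit_rate * (habit_max - habituation)
        observed.append(round(baseline + sensitization - habituation, 2))
    return observed

print(dual_process())
```

Running the sketch gives a response that rises over the first few trials and
then declines, the same general pattern seen in the children's responding for
cheeseburger in Figure 2.5B.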
As I noted earlier, habituation is adaptive because it prevents the organ-
ism from wasting energy responding to a repeatedly presented stimulus.
The ability to change behavior with experience is often advantageous, and
in the next sections we will see that this is clearly true in instrumental and
classical conditioning. But, given all the advantages of flexibility and learn-
ing, why should any behavior ever evolve that is fixed and innate? The
answer is that learning can sometimes carry serious costs. For example, a
mouse would clearly benefit from innately recognizing and defending itself
against a predator the first time one is encountered; a failure to respond
correctly on Trial 1 would likely mean the end of the game (e.g., Bolles,
1970; Edmunds, 1974; Hirsch & Bolles, 1980). Innate behaviors are there
the first time the animal needs them. Evolution by natural selection allows
behaviors to adapt across generations of animals. Learning, on the other
hand, allows behaviors to adapt through experience within an animal’s
own lifetime.

Adaptation and Learning: Instrumental Conditioning


Thanks to instrumental conditioning, animals learn to perform behaviors that
increase their fitness. Food, water, and sex are classic rewards that reinforce
learning; it is no coincidence that they are also important for reproductive
success. The importance of other rewards is less obvious. Humans are social
animals who depend on mutual cooperation; that is presumably why social
stimuli such as smiles and approval seem to serve as such powerful rewards.
Consider the relationship between a parent and a baby. The baby needs an
adult’s help to survive, but in an evolutionary sense, the parent also gains
fitness by helping the child (related genes survive). Accordingly, the par-
ent reinforces appropriate behavior in the infant with social and comfort
rewards. On the other hand, the infant also controls reinforcers that keep
her parents attentive; for example, she smiles when she feels satisfied. (Any
parent of a 2-month-old will tell you how pleasing—and reinforcing—an
infant’s smile is.) The smile promotes bonding and reinforces nurturing be-
havior. The point is that instrumental conditioning is organized here so that
both parties get the things they need to survive and thrive.
The adaptive nature of instrumental conditioning is also illustrated by
a bit of animal behavior mentioned in the last chapter: Zach’s (1978, 1979)
study of crows foraging on the beach in British Columbia. Recall that the
crows select only large whelks from the beach and then drop them onto
preselected rocks from a height of a little over 5 meters. Zach (1979) actu-
ally performed some calculations that suggest that the crows had arrived
at an optimal solution to eating whelks. The crows learned to choose only
whelks whose shells tend to break easily (the big whelks), and the crows
tend to drop the whelks from a height that balances the energetic cost of
lifting them against the height necessary to break their shells. In this ex-
ample, animals learned a complex behavior that seems almost perfectly
in tune with the environment. Thanks to learning, the crow’s behavior is
beautifully adapted to the task of feeding on whelks.

The law of effect


In instrumental conditioning, we are always concerned with the relationship
between a behavior (R) and a biologically significant event or outcome (O).
At an extremely general level, it is easy to show that this learning allows
adaptation to the environment. Outcomes can be either good or bad (i.e.,
having a positive or negative survival value), and Figure 2.6 shows that
behaviors can produce or prevent either type of O. The most famous relation,
of course, is the one in which a behavior produces a good outcome, such as
food, water, or a mate. When a behavior produces a good O, the behavior
usually increases in strength. This particular case is called reward learning.
In other cases, behaviors can lead to outcomes
that are bad or painful. Here, behavior decreases in strength. The effect and
the procedure are both known as punishment; the bad outcome in this case is
called a punisher. Punishment is quite common, and it can have powerful effects
on behavior.
There are two other fundamental behavior-outcome relations. When a behavior
prevents a bad event from occurring, the behavior increases in strength; this
is called avoidance learning. If a behavior terminates a nasty event that is
already present, it will also increase; this is called escape learning. The
final cell in Figure 2.6 describes a situation in which a behavior prevents the
occurrence of a good outcome. This relation is sometimes called omission
training. Not surprisingly, this sort of arrangement usually causes a decrease
in the strength of the response.

                                  Type of Outcome
Effect of behavior (R)        Good Outcome              Bad Outcome
Produces Outcome              Reward (behavior ↑)       Punishment (behavior ↓)
Prevents Outcome              Omission (behavior ↓)     Avoidance/Escape (behavior ↑)

Figure 2.6  The law of effect. Inside each cell is the type of learning that
results from each combination of an effect of behavior, R (left), and a type
of outcome, O (top). Arrows indicate whether behavior increases or decreases
in strength.
Figure 2.6 illustrates that instrumental behavior increases or decreases
depending on its effect on the environment. As you may remember from
Chapter 1, this general rule is called the law of effect. (Many other ver-
sions of the law emphasize only the reward and punishment cells in Figure
2.6.) The law reminds us that instrumental learning generally works so that
animals will maximize benefits (good Os) and minimize costs (bad Os). It
does not tell us how behavior comes to increase or decrease in strength in
the various cells (see Figure 2.6); that will be the subject of later chapters.
For now, I only suggest that the law of effect operates so that behavior can
adapt to the current environment.
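For readers who like a compact summary, the four relations can also be written
as a simple lookup table. The short Python sketch below is only an illustration
of the classification in Figure 2.6; the dictionary and names are mine, not the
book's.

```python
# Illustrative only: the four behavior-outcome relations of Figure 2.6 as a
# lookup from (what the response does, type of outcome) to (name, direction).
LAW_OF_EFFECT = {
    ("produces", "good"): ("reward learning", "behavior increases"),
    ("produces", "bad"):  ("punishment", "behavior decreases"),
    ("prevents", "bad"):  ("avoidance/escape learning", "behavior increases"),
    ("prevents", "good"): ("omission training", "behavior decreases"),
}

for (effect, outcome), (name, change) in LAW_OF_EFFECT.items():
    print(f"R {effect} a {outcome} O: {name}; {change}")
```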

Reinforcement
We use the term reinforcement to describe a situation in which a relation
between an R and an O increases the strength of the response. Reinforcement
means strengthening. Both reward learning and avoidance (or escape) learn-
ing involve a strengthening of behavior. Because reward learning involves
the addition of a (good) O, it is often known as positive reinforcement.
Because avoidance and escape learning involve subtraction of a (bad) event,
they are examples of negative reinforcement. The term negative describes
the removal or subtraction of a stimulus or event, not a negative effect on
behavior. Negative reinforcement is not the same as punishment.

Shaping
Brand new behaviors can be added to an animal’s repertoire through a
process called shaping. In shaping, behaviors that initially do not exist
emerge when approximations of them are reinforced. Many students learn
about shaping when they are given a rat in a psychology lab course and are
asked to teach the rat to press a lever in a Skinner box to get a food reward.
There is always an immediate difficulty because the rat does not press the
lever at first; it is impossible to deliver a reinforcer after a behavior that
never happens. The trick is to reinforce closer and closer approximations of
the lever-press response. Typically, the student first delivers food to the rat
when the rat is anywhere near the lever. Being in the vicinity of the lever
therefore increases in frequency, and once it is happening fairly often, the
student requires the rat to actually position itself above the lever to earn
the next pellet. Once the animal is hovering over the lever, it is more likely
that the lever will be pressed accidentally. Usually it is, and as a result, food
pellets are delivered automatically, the rat begins to press the lever at a
steady rate, and the student is able to go out and seek reinforcers of his or
her own. By reinforcing successive approximations of the response, it is
possible to increase the likelihood of the response so that simple reward
learning can take over.
Shaping is quite familiar in humans. For instance, a baby is presumably
taught to say mommy and daddy through such a process. Parents tend to
reinforce approximations of these words (ma or da-da) when they occur,
with praise, smiling, and so forth. At the same time, other inappropriate
versions are not reinforced. Gradually, the words are spoken with better
and better clarity. Explicit shaping techniques have also been used with
people diagnosed with autism or schizophrenia and with those who are
developmentally disabled. For example, Lovaas (1967) has developed tech-
niques for shaping speech in autistic children, and Silverstein and his col-
leagues (2009) have shaped patients with schizophrenia to increase their
general attentiveness. Shaping methods have also been used to reinforce
smokers and cocaine users for gradually quitting (e.g., Lamb, Kirby, Mor-
ral, Galbicka, & Iguchi, 2010; Preston, Umbricht, Wong, & Epstein, 2001).
Shaping can be described in a slightly more formal way. Suppose that
a rat initially learns to press a lever with an average of about 20 grams of
force, but you want it to learn to press the lever with a force of 80 grams.
Usually, the rat will make a number of presses with forces greater than and
less than 20 grams. A possible distribution is illustrated in Figure 2.7 (curve
1). Notice that a response of 80 grams does not occur at this point. It would
be possible to shape that response by reinforcing only extreme scores (the
shaded area of the distribution). Doing so causes the distribution to shift
to the right because responses with a weight below the new requirement
are not reinforced and become “extinguished.” Responses greater than the
new requirement also tend to occur because the rat is not very precise; its
responses tend to generalize to different values, a phenomenon known as
“induction.” At this stage, a new requirement can be introduced, and the
distribution again shifts accordingly. The process is repeated until an 80-
gram response, which originally had no strength, is performed quite regu-
larly. By reinforcing successive approximations of the 80-gram response, a
new behavior is added to the rat’s repertoire.
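The logic of reinforcing successive approximations can be captured in a few
lines of code. The sketch below is a toy simulation of the force example; the
drift rule, the criterion schedule, and all numbers are assumptions made for
illustration and are not taken from an actual experiment.

```python
# A toy illustration (my own sketch, not from the book) of the shaping logic
# in Figure 2.7: reinforce only presses above a criterion, let the response
# distribution drift toward the reinforced values, then raise the criterion.
import random

random.seed(1)
mean_force, spread = 20.0, 8.0        # grams; starting distribution (curve 1)
criterion, target = 30.0, 80.0        # reinforce only presses above criterion

while criterion <= target:
    presses = [random.gauss(mean_force, spread) for _ in range(200)]
    reinforced = [f for f in presses if f >= criterion]
    if reinforced:
        # The distribution shifts toward the reinforced approximations
        # ("induction"), while weaker presses go unreinforced and extinguish.
        mean_force += 0.5 * (sum(reinforced) / len(reinforced) - mean_force)
    print(f"criterion {criterion:5.1f} g -> mean force {mean_force:5.1f} g")
    criterion += 10.0                  # require a closer approximation next
```

Each pass reinforces only the upper tail of the current distribution, so the
mean force creeps toward the 80-gram target, much as the shaded areas do across
the curves in Figure 2.7.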

Figure 2.7  Shaping introduces new behaviors. Initially, a rat might press a lever
with an average of 20 grams of force (curve 1). By reinforcing successive approx-
imations of a heavier response (shaded areas), new behaviors are introduced
and others are extinguished (curves 2–4).
Figure 2.7 could easily be modified to describe learning in high-jumpers
or pole-vaulters who are being reinforced to jump higher and higher. Of
course, the principles of shaping also apply to other behaviors that do
not line up so conveniently on a single dimension. The point is that by
selectively reinforcing some behaviors and not others, the distribution of
behavior can change so that new acts are introduced. Acquisition of any
complex behavior could follow the kind of principle illustrated in Figure
2.7.
Shaping does not require a coach or a teacher. It can happen more
haphazardly in the natural environment. Consider again Zach’s crows
as they forage on whelks. At the beginning, a crow might dig among
whelks and find one whose shell has been cracked open by waves as
they crash on the rocks. The snack might reinforce continued digging
by the crow among the whelks. This could increase the likelihood that a
whelk is lifted; if its shell cracks, lifting is reinforced, and higher and
higher lifts are reinforced until an optimum height is reached. Alternatively,
a crow digging among the whelks could fly away with a whelk to escape
being pestered by another crow. The whelk could be dropped and break
its shell, causing the behavior to be repeated, and differential reinforce-
ment would shape the final solution. A variety of scenarios are possible.
The point is that shaping does not require the forethought of a teacher;
it occurs all the time in the natural world. It naturally allows the animal
to adapt to its environment.
Shaping actually works like evolution by natural selection. In fact,
Figure 2.7 could have been taken from a textbook illustrating the natural
selection of a trait over generations of animals. If we put “darkness of
moths” on the x-axis, we would have a graph of the moths-becoming-
darker-in-industrialized-England example that we considered earlier. In
this case, the distribution would shift over generations because extremes
of the distribution have successful offspring; the offspring then show a
distribution of their own that is shifted higher. (By the way, shaping and
evolution do not just
select directionally; e.g., they could just as easily select against the extremes
of the frequency distribution.) Both evolution and instrumental condition-
ing involve processes of extinction and shifts in distributions. The parallel
has been noted many times before (e.g., Staddon & Simmelhag, 1971). As B.
F. Skinner liked to note in the latter part of his career, both natural selection
and operant conditioning involve “selection by consequences” (Skinner,
1981). In evolution, the consequence is reproductive success. In instrumen-
tal conditioning, the consequence is an Outcome. By giving animals the
ability to learn through instrumental conditioning, nature has given them
an extremely important way to adapt their behavior to the environment.

Adaptation and Learning: Classical Conditioning


Classical conditioning is another way in which learning allows adapta-
tion. The point is similar to the one just made about instrumental condi-
tioning. By learning that a signal predicts an outcome, an animal is often
able to make a response to the signal before the O actually happens. In
this way, the conditioned response can allow the animal to prepare for
the upcoming O (Domjan, 2005; Hollis, 1982, 1997). Let’s take a look at
some examples.

Signals for food


First consider Pavlov’s original experiment with drooling dogs. It is very
easy to overlook that the response that occurs to the signal—salivation—
functions to help the dog digest the food. By drooling before food enters
the mouth, the dog gets ready to break down the food more readily and
digest it more efficiently. It is also easy to miss that salivation was only
part of what the dog was probably doing in Pavlov’s experiment; it was
just the tip of the iceberg. Pavlovian signals for food also elicit many other
anticipatory, functional responses: Gastric juices begin to flow, enzymes
from the pancreas are secreted, and the hormone insulin—needed to get
nutrients from the blood into the body’s cells—is secreted. Each is part of
a whole constellation of responses that together can help the dog digest
the anticipated food. True to this idea, animals that are given signals for
their meals, and therefore have the opportunity to learn to respond this
way, are better at gaining nutrients from the meal (Powley, 1977; Woods
& Strubbe, 1994).
Classical conditioning also helps animals identify and avoid foods that
contain poisonous substances. Rats eat many meals a day, very opportu-
nistically, and they need to learn about all the new foods they find and
eat. Some foods may contain poisons that can make the animal sick, and
as mentioned in Chapter 1, Pavlovian learning provides a way in which
they can learn to avoid these foods. In the 1960s, some important experi-
ments by John Garcia and his associates established that rats will learn to
reject food flavors that are associated with illness (e.g., Garcia, Ervin, &
Koelling, 1966; Garcia & Koelling, 1966). If the rat is given a novel flavor
(like vinegar) in a drink and is then made sick (usually with exposure to a
drug), the rat will quickly come to reject the vinegar flavor when it is again
offered. The rat learns an “aversion” to the vinegar flavor, a phenomenon
known as taste aversion learning. Taste aversion learning takes the form
of a learned rejection of a flavor (S) that is associated with gastric illness (O).
Humans learn taste aversions, too (e.g., Garb & Stunkard, 1974; Logue,
1985; see also Lamon, Wilson, & Leaf, 1977). I know two or three people
who will not go near tequila because of a conditioning experience they
had during late adolescence. Tequila is a liquor with a pungent flavor that
is sometimes consumed at parties in a way that almost guarantees a taste
aversion learning trial (alcohol poisoning). As a consequence of tasting
tequila before getting sick, my friends have acquired a strong dislike for
the flavor. Taste aversion has many interesting features (see Chapter 6), but
for now you can see that it functions to help humans and other animals
learn about—and stay away from—bad food and drink.
The “dislike” we learn for foods associated with illness is complemented
by a kind of “liking” that is learned for foods associated with good things.
Rats will learn to prefer a neutral flavor (like vanilla) if it is associated
with the sweet taste of sucrose or saccharin (Fanselow & Birk, 1982). They
may also learn to like flavors associated with their recovery from illness
(Zahorik & Maier, 1969) and flavors that are associated with nutrients that
are important to them, such as calories (e.g., Bolles, Hayward, & Crandall,
1981; Capaldi, Campbell, Sheffer, & Bradford, 1987; Mehiel & Bolles, 1984;
Sclafani, 1995). Interestingly, the preference that rats learn for flavors as-
sociated with calories is affected by how hungry or satiated they are. When
the rat is deprived of food, it has a stronger preference for a flavor that has
been associated with calories than when it is satiated (Fedorchak & Bolles,
1987; see also Fedorchak & Bolles, 1988). By allowing us to learn to like the
tastes of foods that contain useful, positive Os, classical conditioning once
again helps us adapt to our environment. Through it we learn to respond
appropriately to foods we have experienced before.

Territoriality and reproduction


Feeding is one example of a behavior that is clearly important to survival
and fitness. Other kinds of behavior are also important. Through sexual
behavior, an animal’s genes are transmitted to future generations, and
through aggressive behavior, animals are able to defend their resources.
Not surprisingly, classical conditioning is involved in these situations, and
it is not difficult to see how important it is to the animal’s fitness.
Karen Hollis has done some fascinating work on conditioning in the
blue gourami (Trichogaster trichopterus), a freshwater fish native to south-
east Asia and Africa (see, e.g., Hollis, 1990). Male gouramis establish their
territories at about the time of the monsoons. They then defend them ag-
gressively against other males, but will allow females to enter and deposit
eggs in nests that the males have made within the territory. Hollis has
shown the advantages of conditioning in both territorial aggression and
sexual behavior. In one set of experiments (Hollis, 1984), two males were
housed at opposite ends of an aquarium but were never allowed to see
each other. Over several days, one of them received trials in which a light
was turned on before a rival male was presented (actual fighting between
the fish was prevented by a glass barrier) so as to establish the light as a
Pavlovian cue or signal for the rival male. The other member of the pair
received the same lights and exposure to a rival, but never together in time.
Over trials, the signal came to elicit aggressive displays in the first fish. The
interesting test, though, came when the two fish—on opposite sides of the
aquarium—met each other for the first time. For both fish, the light was
presented just before the encounter. As Figure 2.8 shows, the fish whose
light had previously signaled the arrival of another male showed more
aggressive biting and tailbeating than the other fish. Had they actually
been allowed to mix, the fish that had the light signal would have won
the fight. There was a clear value to having a light signal the approach of
a rival before the fight.

Figure 2.8  When a male blue gourami receives a Pavlovian cue signaling a
rival male, he exhibits more aggressive biting and tailbeating than does a male
that receives a control stimulus. The light signal gets the male ready for
territorial defense. (After Hollis, 1984.)
When a female instead of a male enters the territory, the male gourami
needs to switch from aggression to a sexy mood. In fact, after some initial
aggressive displaying, males in their natural environment do eventually
perform a behavior that brings the entering female toward the nest. This
behavior can also be conditioned. For example, in another experiment (Hol-
lis, Cadieux, & Colbert, 1989), male-female pairs saw each other regularly
for several days. For some pairs, each exposure was signaled by a red light.
Other pairs received similar exposures to each other, and similar lights, but
exposures and lights were never paired. In a crucial test, all fish received
their signals immediately before an encounter. The signal had a clear ef-
fect on the males. When the red light had previously signaled a female, it
reduced the male’s aggressive biting and increased the time he spent in a
courtship appeasement posture (Figure 2.9). There was less of an effect
on the females (not shown) than on the males, although there were other
signs that they had learned a signal-mate relationship. The point of these
results, though, is that a signal for a mate (as opposed to a rival) prepares
the male for reproduction rather than fighting. Amazingly, males that an-
ticipate the arrival of a female this way before copulation also spend more
time building a nest, spawn quicker, and even produce more young than
males without the signal (Hollis, Pharr, Dumas, Britton, & Field, 1997). This
effect of the signal on actual reproductive success is a dramatic illustration
of how conditioning can enhance fitness.

Figure 2.9  When a male blue gourami receives a Pavlovian cue signaling the
presence of a female, aggressive biting is reduced and time in a courtship
appeasement posture is increased. The signal gets the male ready for courtship.
(After Hollis et al., 1989.)
Conditioning of reproductive behavior has been shown in a number
of species (Domjan & Hollis, 1988). Michael Domjan and his collaborators
did some related work with Japanese quail (Coturnix coturnix japonica) (see
Domjan, 1994, 1997 for reviews). Males of this species readily learn about
stimuli that signal an opportunity to mate. In some experiments (Domjan,
Lyons, North, & Bruell, 1986), a red light came on before the male was
allowed to copulate with a female that had been released into his cage. A
control group had the same exposure to the light and copulation, but the
two events were never paired together. The first group of males quickly
learned to approach and stay in the vicinity of the light when it was turned
on (Figure 2.10). They were also quicker than the other group to grab
and mount the female when she was presented. Once again, the signal
prepared the male for presentation of the female. Male quail that receive a
signal may actually release more spermatozoa than do control males during
copulation (Domjan, Blesbois, & Williams, 1998). They are also more likely
to fertilize the female’s eggs (Matthews, Domjan, Ramsey, & Crews, 2007;
see also Domjan, Mahometa, & Matthews, 2012). Interestingly, although
there is less evidence of conditioning in the female quail, partly because
females seem to be less interested than males in staying near a member of
the opposite sex (Domjan & Hall, 1986), egg fertilization is especially likely
if both the male and female receive a copulation signal before they engage
in sex (Domjan et al., 2012).
Other experiments indicate that male Japanese quail also learn about the
characteristics of the birds with which they copulate. For example, when a
brown male quail copulates with a buff-colored female, the male learns to
strongly prefer buff-colored females, in the sense that he approaches them
more often than non-buff-colored females (Nash & Domjan, 1991). In this case,
the birds learn to associate the features of the female with copulation
(Crawford & Domjan, 1993; Domjan, Akins, & Vandergriff, 1992).

Figure 2.10  A male Japanese quail learns to approach a Pavlovian signal (in
this case, a light) that is associated with access to a female. (After Domjan
et al., 1986.)

Fear
In other situations, animals are confronted with signals for danger. In the
laboratory, a signal may be paired with a brief but painful electric shock.
Because of conditioning, the signal comes to arouse a complex set of behav-
ioral and physiological responses that we know as “fear.” The constellation
of responses includes a number of changes in heart rate, blood pressure,
and respiration that prepare the animal to defend itself (see Hollis, 1982).
In rodents, one of the main responses to a cue associated with elec-
tric shock is “freezing”—the animal stops in its tracks (e.g., Blanchard &
Blanchard, 1969; Bouton & Bolles, 1980; see Chapter 10). Although other be-
haviors, like running away, are available to frightened rats, freezing appears
to be the dominant response (Fanselow & Lester, 1988). Rodents freeze when
they encounter natural predators (e.g., weasels, cats, and gopher snakes),
and being very still has a definite payoff; for example, mice that freeze in
the presence of a predator are less likely to be attacked and killed than those
that do not (Hirsch & Bolles, 1980). Freezing may reduce predation because
predators tend to respond to movement; alternatively, it may cause a preda-
tor to pay less attention to the prey or may remove releasers for predation
(Suarez & Gallup, 1981). Freezing is an adaptive response to danger signals
because it helps the animal prepare for a dangerous encounter.
The fear system elicited by danger signals has other adaptive compo-
nents. In addition to the effects just described, a signal for electric shock
induces a state of analgesia. That is, the frightened rat becomes less sensitive
to pain (see Fanselow, 1985, for a review). Often, the analgesia is caused
by the release of endorphins—natural opiates in the body that can deaden
pain. Endorphins are released when a rat is exposed to a signal associated
with shock (e.g., Fanselow & Baackes, 1982), when the rat is confronted by
a cat (Lester & Fanselow, 1985), or when the rat is exposed to the odor of
other rats that have been stressed by exposure to shock (Fanselow, 1985).
Bolles and Fanselow (1980; see Chapter 10) noted that defense against a
predator would be compromised were a wounded animal to feel pain and
consequently limp or lick a wound. Classically conditioned signals for shock
allow the animal to prepare for a defensive encounter because they evoke
a whole constellation of adaptive physiological and behavioral responses.

Conditioning with drugs as the outcome


Classical conditioning can readily occur with drugs as Os. This is because
whenever a drug is ingested, there is an opportunity to associate it (O) with
cues (Ss) that are present at the time. (Drugs are also powerful reinforcers
in the instrumental conditioning sense; animals will learn to press levers
to get many of the drugs that are abused by humans [see, e.g., Young &
Herling, 1986].) In humans, the places where drugs are taken, the people
with whom drugs are taken, or the stimuli involved in drug-taking rituals
may presumably be learned as signals for the drug. Classical conditioning
can occur anywhere, and the conditioned response is once again adaptive.
Shepard Siegel discovered an important feature of the response ac-
quired in such situations (see Siegel, 1989, for an especially good review).
In Pavlov’s famous dog experiment, the response (salivation) learned to
the signal (ringing of a bell) was the same as the one elicited by O (also
salivation). With drug Os, however, the response to the signal often looks
very different from the one elicited by the O. Often it is the opposite, and
it functions to cancel the upcoming effect of the drug. This makes adap-
tive sense because a dose of a drug can cause an organism to be out of
balance. The response functions (again) to preserve equilibrium or prepare
the organism for the upcoming stimulus. In conditioning with drugs, one
often observes the acquisition of a conditioned compensatory response.
Consider morphine, an opiate that is related to other abused drugs like
heroin. Morphine is often given to patients to reduce their pain; thus, one
effect of the drug is analgesia. This effect typically habituates, however; the
analgesic effect decreases a little with each exposure to the drug (Figure
2.11A). The decrease in the drug effect is called drug tolerance. Siegel
(1975) proposed that this tolerance might result from classical condition-
ing. Each time the drug is taken, it could become more strongly associated
with environmental cues. If the resulting response to the environmental
cues were compensatory (Figure 2.11B), it would subtract from the drug’s
simple effect. As conditioning increased over trials, the response would
begin to reduce the drug’s effect.
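Siegel's proposal can be summarized with a little bookkeeping. The sketch below
is my own illustration; the learning rule and the numbers are assumptions, not
Siegel's actual model. It simply subtracts a growing compensatory response from
a constant direct drug effect.

```python
# A bookkeeping sketch of the logic in Figure 2.11 (illustrative assumptions
# only; not Siegel's model). The drug's direct effect stays constant, but the
# compensatory response conditioned to drug-paired cues grows with each trial,
# so the observed (net) effect shrinks.

direct_effect = 10.0    # the drug's unconditioned effect, (a) in Figure 2.11
compensatory = 0.0      # conditioned compensatory response, (b)
learning_rate = 0.3

for trial in range(1, 9):
    compensatory += learning_rate * (direct_effect - compensatory)
    observed = direct_effect - compensatory
    print(f"trial {trial}: observed drug effect = {observed:.1f}")

# If the drug is later taken without the usual cues, the compensatory response
# is not elicited, so the observed effect jumps back toward 10.0 (the loss of
# tolerance in the "Different" condition of Figure 2.12).
```

The printed values shrink across trials, which is the tolerance curve in Figure
2.11A; removing the compensatory term, as when the drug is taken in a new
context, restores the full effect, as in Figure 2.12.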
There is now an impressive amount of evidence for the role of con-
ditioning in drug tolerance. For example, if tolerance is really due to a
response elicited by signals for the drug, it should disappear if the drug
is tested without those cues. Experiments have confirmed that tolerance
to several drugs might work this way. In one of the earliest experiments with
morphine (Siegel, 1975), one group of rats received morphine injections in one
room (Room A), while another group of rats received the same injections in a
different room (Room B). The drug’s analgesic effect was tested by putting a
subject rat on a “hot plate,” a plate that is heated to an uncomfortable, but
not damaging, temperature. In a few seconds, the rat usually lifted and licked
its paws, and if it took a long time to do so, it was probably not feeling
pain. After tolerance had developed in Siegel’s experiment, both groups were
tested with the drug in Room A. The two groups had received the same number of
previous injections of the drug, but one of the groups now received it in Room
A for the first time. As you can see in Figure 2.12, this group showed little
tolerance; it was as if they were receiving the drug for the first time. A
similar loss of tolerance with a change of contexts has been reported with
alcohol (e.g., Crowell, Hinson, & Siegel, 1981; Mansfield & Cunningham, 1980),
barbiturates (e.g., Hinson, Poulos, & Cappell, 1982), amphetamines (Poulos,
Wilkinson, & Cappell, 1981), and at least one benzodiazepine tranquilizer
(midazolam) (King, Bouton, & Musty, 1987). In addition, I will introduce in the
next few chapters far more subtle conditioning effects that have also been
reported in morphine tolerance (Siegel, 1989).

Figure 2.11  The development of drug tolerance (A) can be due to the
conditioning of a compensatory response (B). The learned response becomes
stronger after each drug exposure. The observed effect of the drug is its
direct effect (a) minus the conditioned response (b). Pavlovian learning is
always possible when we are exposed to biologically significant events (Os).

Figure 2.12  Tolerance to morphine’s analgesic effect develops quickly over
trials (left). Tolerance is present as long as morphine is tested in the
presence of room cues that have been associated with the drug (Same), but it is
lost if the drug is tested in the presence of different cues (Different). Such
findings have important implications for understanding drug overdose. (After
Siegel, 1975.)
It is sometimes possible to measure the compensatory response directly.
If morphine’s effect is to reduce pain, perhaps the learned response that
develops in the presence of the drug cues is the opposite: an increase in
sensitivity to pain. Siegel (1975) demonstrated exactly that. Rats received
morphine injections in either Room A or Room B. Then they were tested
without the drug in Room A. (Siegel substituted an inert injection of saline
for the drug.) When no drug was expected in Room A (because it had previ-
ously been delivered in Room B), the latency to lift the paw was about 9.1
seconds. When the drug was expected in Room A, however, the latency to
respond was significantly shorter (4.4 seconds). In the room associated with
morphine, the rat appears to be more sensitive to pain than in the other room.
The learning of a compensatory response is clearly adaptive. For ex-
ample, tolerance to a drug can protect an organism from possible overdose.
This fact has clear implications for drug abuse. If tolerance to drugs occurs
as a result of compensatory response conditioning, an abuser will be espe-
cially vulnerable to overdose in a new context. Some survivors of heroin
overdose have actually reported that when they overdosed, they had taken
the drug under novel circumstances (Siegel, 1984; see also Siegel & Ramos,
2002). Data from self-reports are not entirely reliable, however, because
a subject’s memory might be faulty or because he or she might not have
noticed other details that could have contributed to the overdose. The most
compelling evidence comes from experiments with rats (Siegel, Hinson,
Krank, & McCully, 1982; see also Melchior, 1990; Vila, 1989). Siegel et al.
(1982) gave two groups of rats 15 doses of heroin in one room. A control
group received none. At the end of the experiment, all rats were given a
large dose of heroin. A very large percentage of the control subjects—for
whom it was the first exposure to heroin—died (96%). A smaller percent-
age (32%) of drug-exposed subjects, who were tested in the heroin room,
died (another example of tolerance). Most important, tolerant rats given
the lethal dose in an environment that was different from the one in which
they had received the previous injections were about twice as likely to die
(64% died) as the animals injected in the usual environment. Outside of
the Pavlovian signal of the drug, the rats lost the protection provided by
the compensatory response.
Siegel and others have gone on to argue that conditioning mechanisms
like the ones described above may lead to drug addiction if a person re-
peatedly takes the drug (see also Chapter 9). For example, many of the
compensatory responses that can be triggered by cues associated with
opiates (like increased pain sensitivity, hyperactivity, and increased body
temperature) are unpleasant. Abusers may therefore eventually learn to
reduce or escape these responses by taking the drug (e.g., Siegel, 1977).
(This reduction would negatively reinforce drug-taking behavior.) It seems
a little strange to think that something as adaptive as the conditioned com-
pensatory response could play a role in causing something as maladaptive
as drug addiction. We find the same paradox when we consider the causes
of eating and overeating, where normal psychological and biological pro-
cesses can actually lead to obesity. We can resolve the paradox by noting
that our bodies evolved in environments in which food and drugs were
relatively scarce. Our psychological and biological systems did not evolve
to deal with their overabundance in the current environment. Under nor-
mal circumstances, conditioning is fundamentally adaptive in the sense
that it gets us ready to deal with biologically significant outcomes.

Sign tracking
It is possible to describe behavior that results from classical conditioning in
a much more general way. Broadly speaking, classical conditioning affects
behavior in a manner that complements the law of effect in instrumental
conditioning (see Figure 2.6). We can once again classify outcomes as good
or bad according to survival value. Figure 2.13 describes the four possible
relations between an S and a good or bad O. It is similar to Figure 2.6 for in-
strumental conditioning except that we are now concerned with Ss instead
of Rs. Another difference is that the term predict replaces produce because
the S here does not necessarily cause O the way that the R (in Figure 2.6)
does in instrumental conditioning. Both Ss and Rs can nonetheless predict
the occurrence or nonoccurrence of an O.
When a signal predicts a positive outcome, animals often begin to ap-
proach the signal. This tendency is known as sign tracking. The term was
originally used to describe the fact that pigeons that receive a brief illumina-
tion of a plastic disk just before presentation of food will learn to approach
and peck the disk when it is illuminated (Brown & Jenkins, 1968; Hearst &
Jenkins, 1974). But all sorts of animals will tend to approach signals for all sorts of good Os; for example, honeybees will approach visual cues and odors associated with sucrose (see Bitterman, 1988, 1996, for reviews), and rats will tend to approach cues associated with food (e.g., Karpicke, Christoph, Peterson, & Hearst, 1977). Chicks will also tend to approach and snuggle with cues for warmth (Wasserman, 1973), and let us not forget the male Japanese quail (see Figure 2.10). In fact, you can think of the rat's tendency to learn to "like" flavors associated with positive Os such as calories (discussed above) as a kind of approach response; it can also be considered a type of sign tracking.

Figure 2.13  Sign tracking in Pavlovian learning. Inside each cell is the type of behavior that results from each different combination of S and O. (Cells: S predicts a good O, approach S; S predicts a bad O, withdraw from S; S predicts the absence of a good O, withdraw from S; S predicts the absence of a bad O, approach S.)

When signals instead predict negative outcomes, animals tend to withdraw from them. Rats will stay away from cues and places associated with an electric shock (see, e.g., Karpicke, Christoph, Peterson, & Hearst, 1977; Odling-Smee, 1975), and presumably so
will humans. Taste aversion learning is also an example of the same general
response tendency: Association of a flavor with illness makes the animal
withdraw from (and reject) the flavor. This general tendency is sometimes
known as negative sign tracking. The term describes the behavior (with-
drawal) rather than the nature of the O.
Interestingly, if a signal predicts a decrease in the probability of a bad O,
animals will tend to approach it. Rats will approach a location that predicts
freedom from shock (Leclerc & Reberg, 1980). In a similar way, rats will
tend to approach and like flavors that have been associated with feeling
better, the so-called medicine effect (Zahorik & Maier, 1969).
The final cell in Figure 2.13 describes a situation in which a signal
predicts a decrease in the probability of a positive O. One way to think of
this sort of cue is that it has frustrating effects: A good O that is otherwise
likely to happen is now less likely to occur. Somewhat understandably,
animals will tend to withdraw from this sort of signal. Pigeons will move
away from an illuminated disk if it predicts a decrease in the probability
of food (e.g., Hearst & Franklin, 1977). Rats will also stay away from
boxes associated with the absence of expected food (e.g., Daly, 1969) or a
flavor that signals that a drink will contain fewer calories than otherwise
expected (e.g., Boakes, Colagiuri, & Mahon, 2010; Boakes, Patterson, &
Kwok, 2012).
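To make the four cells concrete, here is a minimal sketch in Python (purely illustrative; the function name and labels are not from the text) of the general response rule just described:

    def sign_tracking_response(outcome_is_good, signal_predicts_outcome):
        """Typical Pavlovian response to a signal S, following the logic of Figure 2.13.

        Animals approach signals for good Os and for the absence of bad Os;
        they withdraw from signals for bad Os and for the absence of good Os.
        """
        if signal_predicts_outcome:
            return "approach S" if outcome_is_good else "withdraw from S"
        return "withdraw from S" if outcome_is_good else "approach S"

    # A keylight predicting food is approached; a tone predicting shock is avoided.
    print(sign_tracking_response(outcome_is_good=True, signal_predicts_outcome=True))   # approach S
    print(sign_tracking_response(outcome_is_good=False, signal_predicts_outcome=True))  # withdraw from S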
As we have already seen, Pavlovian signals have many effects in addi-
tion to these gross approach and withdrawal tendencies. Notice, though,
that these general tendencies, like the law of effect in instrumental con-
ditioning, have a nice function. They may help ensure that animals con-
tinue to make contact with positive Os and stay away from bad ones. In
a sense, signals that animals learn about through classical conditioning
guide behavior in a way that is highly consistent with the response rule
represented in instrumental conditioning’s law of effect. Once again, the
effect of classical conditioning (stimulus learning) is to help the organism
adapt to the environment.

Other Parallels Between Signal and Response Learning


Given the similar functions of instrumental and classical conditioning, it
is not surprising that they are each sensitive to very similar variations in
O. It is widely believed that both types of learning follow similar general
rules or laws (e.g., Dickinson, 1980; Mackintosh, 1983; Rescorla, 1987). We
will examine this issue in more detail in Chapter 10. For now, I will describe
four important parallels between instrumental and classical conditioning
that flow directly from the idea that both are designed to optimize the
animal’s interactions with important events or outcomes.

Extinction
In either instrumental or classical conditioning, responding remains high
as long as R or S continues to produce or predict O. If the outcome is
Figure 2.14  Extinction occurs in either instrumental or classical conditioning when O is no longer delivered after the R or S. (A) Instrumental conditioning: after an instrumental (operant) lever-press response is learned (acquisition), rats stop pressing the lever when reward is withheld (extinction); plotted as mean lever-press responses per minute across sessions. (B) Classical conditioning: when an auditory cue is paired with food, rats show an excited "head jerk" response (acquisition), which then declines when food is no longer presented (extinction); plotted as the proportion of behavior across sessions. (A, after Nakajima, Tanaka, Urushihara, & Imada, 2000; B, after Bouton & Peck, 1989.)

dropped from the situation, though, responding will decline. Pavlov first
noticed that once the dog was salivating to the sound of the bell, the sali-
vation response declined if the bell was rung several times without food.
In instrumental conditioning, a rat that has been trained to press a lever
for a food reward will stop pressing if the reward is no longer presented.
Both the procedure of withholding O after conditioning and the decline
in responding that results from that procedure are called extinction. Fig-
ure 2.14 shows examples of extinction in both instrumental and classical
conditioning.
Extinction is extremely important in conditioning and learning; you
find it almost as often as you find acquisition itself. (Learning that results
from the pairings of R or S with O is often called acquisition to contrast it
with extinction.) Extinction is intimately involved in shaping: When rein-
forcers are applied for specific behaviors, behaviors that are not reinforced
are eliminated. In fact, extinction is a crucial process in behavior change.
Thanks to extinction, an animal will stop foraging in a particular location
or patch if the payoff for foraging drops to zero. Extinction allows behavior
to continue to adapt to a changing environment.
Extinction is an important tool in the clinic because clinical psycholo-
gists are often interested in eliminating behaviors that cause problems for
the client. For example, a child may be referred to a psychologist because


the child keeps acting out and disrupting his or her class. One treatment
strategy would be to identify the events that reinforce this behavior (e.g.,
attention or peer approval) and then eliminate them. If the reinforcers that
maintain the behavior are withheld, the behavior should theoretically decline. Extinc-
tion is also often the method of choice in treating classically conditioned
phobias and anxiety disorders. A client’s fear of insects, for example, can be
treated by exposing the client over and over to the feared stimulus without
an aversive consequence (e.g., Stampfl & Levis, 1967; Wolpe, 1958). Such
exposure therapy is one of the most effective treatments used in reduc-
ing learned fears (e.g., Barlow, 2002; Craske, Treanor, Conway, Zbozinek,
& Vervliet, 2014).
It is tempting to conclude that extinction involves destruction of the
original learning, but it is fairly easy to show that a great deal of the origi-
nal learning actually remains after extinction (Bouton, 1991, 1994a, 2004).
One example was provided long ago by Pavlov. If extinction occurs on one
day and then we wait some period of time before the signal is presented
again, the response often recovers to some extent. This effect is known
as spontaneous recovery (after extinction): An extinguished response
can recover with the passage of time. Spontaneous recovery happens in
both instrumental and classical conditioning. In both cases, withholding
O causes a decrease in performance, but this does not necessarily reflect a
loss of what was originally learned. What an animal does is not always a
direct reflection of what it “knows”: It is always important to distinguish
between performance and learning. This distinction will come up in many
ways and in many different chapters.
You may remember that “spontaneous recovery” also occurs when time
elapses after habituation. Extinction in classical conditioning is similar to
habituation (see Figure 2.3) in that both effects involve a decline in behavior
when a stimulus is repeatedly presented. The two processes are different,
however. The most important difference between them is that extinction
is a decline in a learned behavior that occurs when the usual outcome (O)
is omitted, whereas the behavior in habituation is not learned. Extinction
is also different from forgetting. When something is forgotten, it is lost be-
cause of the simple passage of time. In extinction, behavior is lost because
of direct experience with R or S now disconnected from O. Forgetting and
extinction will be discussed in more detail in Chapter 5.

Timing of the outcome


Because both instrumental and classical conditioning allow the animal
to adapt to an upcoming event or outcome, it is not surprising that both
types of learning depend on when the critical events (R and O or S and
O) occur. As a general rule, the closer two events occur together, the more
likely the animal will behave as if the two events were associated. Learning
is best when the events are contiguous, a fact appreciated by the British
Empiricists such as Locke and Hume (see Chapter 1).
Figure 2.15  Reward (A) and punishment (B) are most effective when O is presented immediately after the response rather than delayed. (A) Reward: start speed (cm/s) over blocks of 2 days for immediate reward versus a 10-second delay. (B) Punishment: suppression ratio over daily sessions for immediate punishment versus a 5-second delay; the control group received no punishment. (After Capaldi, 1978; Church, 1969.)

Instrumental conditioning works best when the behavior is followed


immediately by O. People experienced with the shaping of behaviors in
either animals or humans know that a delay of a few seconds can ruin the
effects of a reinforcer. We are always engaging in a stream of different activi-
ties. If the reinforcer is delivered even a few seconds after a target response,
it is likely that some other behavior will be reinforced instead. Timing of O
is also very important in punishment. For best results, the punisher must
be applied quickly. Figure 2.15 illustrates the effect on performance of in-
troducing a delay between the response and the reinforcers and punishers.
Timing is also important in classical conditioning. As a general rule,
the animal will respond more to S if S has preceded O by only a short
delay. One of the most interesting facts about classical conditioning is that
although this rule is always true, the intervals between S and O that allow
learning differ quite a bit for different instances of conditioning. For ex-
ample, in fear conditioning, a rat will learn to be afraid of a tone even if the
tone ends tens of seconds before an electric footshock (O) is presented.
Much longer intervals, however, prevent much learning from occurring. In
contrast, when rats are learning taste aversions, a delay of several hours
between the flavor (S) and getting sick (O) still allows learning to occur.
The discrepancy is consistent with the view that taste aversion learning is
adapted to allow animals to learn about slow-acting poisons. I will examine
some implications of this view in Chapter 6. For now, it is important to
notice that even in taste aversion learning, conditioning is still better the
closer that S and O occur together in time (Figure 2.16).
One way to think of the effects of the timing of O on learning is that
learning is designed to uncover the probable “causes” of O (e.g., Dickin-
son, 1980). Instrumental conditioning works so that animals will repeat

Figure 2.16  Classical conditioning is better when the interval between S and O is minimal. (A) Fear conditioning in rats, where a tone was paired with an electric shock that was delayed by 0, 10, or 30 seconds; fear is measured as the latency to lick (in seconds). (B) Autoshaping in pigeons, where pigeons were presented an illuminated disk for 12 seconds that was then followed by food after various intervals; responses per minute are plotted against the trace interval (note the log scale on the x-axis). (C) Flavor aversion learning in which rats received a drink of saccharin either alone (Control) or followed by X-irradiation (which made them sick) after various delays; in the test, rats were given a choice between saccharin and water, and preference for saccharin (%) is plotted against the trace interval in hours. Notice that the intervals permitting learning depend on the example of conditioning: Taste aversion learning still works well when hours intervene between S and O. (A, after Marlin, 1981; B, after Lucas, Deich, & Wasserman, 1981; C, after Smith & Roll, 1967.)
behaviors that cause good Os (reward learning) and stop doing behaviors
that cause bad ones (punishment); an act that precedes an O by a relatively
brief time interval is more likely to have caused O than an act that occurred
more remotely in time. A similar argument works in classical conditioning.
The rat learns to be afraid of a tone that is a plausible signal (or cause) of a
footshock. It might also reject a flavor that has preceded illness by an hour
or two; once again, timing is important, but here it is likely that a food
consumed an hour or two ago contained a poison that is the cause of the
present illness. All types of learning are sensitive to the interval between
R or S and O because the causes of O tend to have occurred recently. Notice that we
are merely thinking about what learning is designed to do; it is not neces-
sary to believe that the animal is consciously or actively seeking causes. In
learning, timing of O is nevertheless (almost) everything.

Size of the outcome


Behavior resulting from instrumental and classical conditioning is also
strongly affected by the size or value of O. In general, the bigger (or longer
or more intense) the O, the more powerful its impact on behavior. This is a
direct effect of the fact that the function of either instrumental or classical
conditioning is to optimize interactions with O. In other words, the bigger
the outcome, the more of it there is to interact with.
With few exceptions, larger positive Os lead to stronger overall behav-
ior. For example, rats will run down alleyways at faster and faster speeds
when larger and larger rewards await them at the end (e.g., Bower, 1961;
Crespi, 1942; Figure 2.17). These sorts of results are consistent with the gen-

Figure 2.17  Bigger Os cause better response learning. (A) Reward learning in rats in a runway: mean running speed over blocks of 2 days for rats rewarded with 8 pellets versus 1 pellet. (B) Punishment of lever pressing in rats with different intensities of electric shock (0.1 to 2.0 milliamperes): suppression ratio over sessions, with an unpunished control. (A, after Bower, 1961; B, after Camp et al., 1967.)
Figure 2.18  Bigger Os cause better stimulus learning, too. (A) Classical conditioning with a food O (1, 2, 4, or 8 pellets): when an auditory cue signals food, rats show an excited "head jerk" response, plotted as the proportion of behavior over sessions. (B) Classical conditioning with an electric footshock O (0.1 to 0.5 milliamperes): rats freeze in the presence of a cue signaling an electric shock. (A, after Morris & Bouton, 2006; B, after Holland, 1979.)

eral belief that people will work harder and more productively for larger
rewards. Similarly, the stronger an aversive O is, the stronger its punishing
effect will be (e.g., Azrin, 1960; Camp, Raymond, & Church, 1967). As a first
approximation, your intuitions about how the size of O affects behavior are
accurate: Instrumental action is directly affected by the size of O.
Consistent with the theme of this chapter, the bigger the outcome, the
more responding one also observes in classical conditioning. Larger posi-
tive Os produce stronger appetitive behaviors (e.g., Holland, 1979; Morris
& Bouton, 2006); similarly, larger negative Os produce stronger fears (e.g.,
Annau & Kamin, 1961; Holland, 1979; Morris & Bouton, 2006; Figure 2.18).
The size of O has potent effects on behavior that results from both response
and stimulus learning.

Preparedness
Signal and response learning are also alike in that some combinations of
events are learned more readily than others. Animals behave as if evolution
has “prepared” them to associate certain events or stimuli.
The idea of preparedness owes most to the pioneering work of John Garcia. In one of the most important papers published on
classical conditioning, Garcia and Koelling (1966) examined the rat’s ability
to associate two types of Ss (a taste stimulus or an audiovisual stimulus)
with two types of Os (an illness or an electric shock). In the conditioning
trials, rats drank a flavored solution from a water bottle; at the same time,
every time their tongue lapped at the drinking spout, it caused a click
Figure 2.19  Preparedness in classical conditioning. (A) Experimental design, crossing the stimulus (taste or bright-noisy water) with the outcome (illness or shock). (B) Tests of taste and bright-noisy water (mean licks per minute) after either had been paired with illness or with an electric shock. Well-conditioned cues should have suppressed licking. Thus, taste was a better cue for illness than for an electric shock, and bright-noisy water was a better cue for an electric shock than for illness. (After Garcia & Koelling, 1966.)

and a flash of light (Figure 2.19). (The audiovisual stimulus is known as


“bright-noisy water.”) Rats in different groups then received either an elec-
tric footshock or nausea after exposure to this complex stimulus. Finally,
the rats were tested with either the taste or the bright-noisy water alone.
The results were striking. The rats that were made ill during condition-
ing showed a strong rejection of taste, but little rejection of bright-noisy
water. Conversely, the rats that received an electric footshock during con-
ditioning showed the opposite pattern: They rejected bright-noisy water
but not taste. Similar results have been obtained when taste and auditory
cues were given to separate groups during conditioning (Domjan & Wilson,
1972). It is not possible to claim that the two Ss differed in salience; each
was associated better than the other with one of the Os. Similarly, it is not
possible to say that the outcomes differed in their salience; each was associ-
ated better than the other with one of the Ss. What matters is the specific
combination of S and O. When sick, the rat behaves as if it thinks “It must
have been something I ate” (see Chapter 1).
As mentioned in the last chapter, the rat’s tendency to associate tastes
with illness (and external cues with shock) appears to be built in at birth.
In an ingenious experiment, Gemberling and Domjan (1982) showed es-
sentially the same pattern of results in rats that received conditioning when
they were only one day old. Rats of this age have both their ears and eyes
closed, so it was not possible to use audiovisual cues. Gemberling and
Domjan (1982) therefore substituted texture (a rough cloth or the slippery
interior of a milk carton) for the bright-noisy water. Compatible results
were obtained; preparedness is evident before much experience with vari-
ous kinds of stimuli can have an effect. The idea that the associative predi-
lection is inborn is also supported by the fact that the bobwhite quail—a
bird that feeds during the daytime and swallows seeds whole—appears to
associate illness with visual cues more readily than with taste (Wilcoxon,
Dragoin, & Kral, 1971).
The rat’s tendency to associate illness with taste seems to make sense
from an evolutionary perspective. Rats are omnivores whose diet varies
quite a lot; they feed opportunistically and mostly at night. Foods can
usually be identified by flavor, and the rat must be able to learn about and
identify the foods that contain slow-acting poisons. Given this scenario,
rats that were able to avoid flavors associated with poisons would have
a selective advantage over rats that could not make the same association
(e.g., Rozin & Kalat, 1971).
Preparedness also appears to influence instrumental conditioning,
where the reward sometimes fails to increase the desired behavior. Keller
Breland and Marian Breland made a career of shaping animals to do inter-
esting things for television cameras (among other things). Keller had been a
student of B. F. Skinner. However, in a classic paper, the Brelands (Breland
& Breland, 1961) documented some entertaining difficulties that they had
encountered while training animals to do certain things. For example, for
a bank commercial, they once tried to train a pig to put a wooden coin in
a piggy bank. Unfortunately, despite intensive and heroic efforts, the pig
never learned to put the coin in the bank; instead, it began rooting the coin
around the floor of the pigpen with its snout. They noted that rooting is one
of the pig’s natural foraging behaviors. (I will return to intrusions of this
sort of “misbehavior” in Chapter 10.) In a similar vein, Bolles (e.g., 1972b)
noted that although it is easy to train a rat to press a lever for food, it is
much more difficult to teach it to perform the same response to avoid shock.
Stevenson-Hinde (1973) showed that young male chaffinches learned a
pecking response more readily when they got food as a reward than when
the reward was a tape recording of an adult chaffinch singing. (Adult song
was effective at reinforcing the birds to perch on a twig.)
Sara Shettleworth performed some of the most systematic research on
preparedness in a series of studies with golden hamsters (e.g., Shettleworth,
1975, 1978; Shettleworth & Juergensen, 1980). When each of several natural
behaviors was followed by a food reward, dramatically different results
were obtained. For example, rearing in the middle of the test arena, rearing
while moving the paws rapidly at the walls (“scrabbling”), and digging
in the sawdust on the floor all increased when paired with a food reward,
but face-washing, scratching, and scent-marking behaviors did not (Figure
2.20). The same behaviors also differed in how easily they were punished
Figure 2.20  Preparedness in instrumental conditioning. Some of the hamster's natural behaviors are easier to associate with food than are other behaviors: scrabbling, digging, and rearing increased with food reward, whereas face washing, scratching, and scent marking did not. Time spent in each behavior (mean seconds per 1200-second session) is plotted across reinforcement and extinction sessions. (After Shettleworth, 1975.)

by an electric shock (Shettleworth, 1978), although the degree to which a


behavior was suppressed by punishment was not perfectly predicted by
the degree to which it was strengthened by food reward. In principle, dif-
ferent R-O combinations may fail to produce learning for any of a variety
of reasons (see Domjan, 1983, for a review), but the available evidence
suggests that certain combinations of Rs and Os are learned more readily
than others.
Does preparedness exist in humans? Yes, probably. People who have
survived the tequila ritual mentioned earlier tend to associate the flavor of
tequila—and not the friends or the shot glass that were also present—with
alcohol poisoning. (“It must have been something I drank.”) As is the case
with most anecdotal evidence, though, other interpretations are also pos-
sible. For example, the flavor of tequila may merely be the most novel and
distinctive of the available cues. Other data, however, do suggest prepared-
ness in humans. Seligman (1971) noted that snake and spider phobias are
especially prevalent; this may suggest preparedness in human fear-learning
because snakes and spiders are presumably no more likely to be paired with
pain than are knives or electrical outlets. In laboratory experiments, people
associate an electric shock more readily with fear-relevant stimuli (images
of snakes, spiders, or angry faces) than with fear-irrelevant stimuli (flowers,
mushrooms, or happy faces) (Öhman, Dimberg, & Ost, 1985; see Öhman &
Mineka, 2001, for one review). (Interestingly, monkeys also associate fearful
experiences, excited by the sight of another monkey acting fearfully, with


snake or crocodile models but not with flowers or toy bunnies [Cook &
Mineka, 1989, 1990].) People with a strong fear of snakes or spiders also
overestimate the correlation between those stimuli and shock in experiments in which shocks are
equally associated with relevant and irrelevant stimuli (Tomarken, Mineka,
& Cook, 1989). Susan Mineka (1992) has suggested that these tendencies may
reflect the influence of “evolutionary memories” that, along with other cog-
nitive biases shown by fearful subjects, have an important effect on human
emotional disorders. Not all combinations of Ss and Os or Rs and Os are
equally learnable. It is useful to view human learning and behavior—like
that of other animals—within its evolutionary context.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources and online quizzes.

Summary
1. Behaviors may be selected by evolution if they have survival value. Such
behaviors are innate in the sense that they have no obvious original
basis in learning.
2. Even innate behaviors can be modified with experience. In habituation,
for example, responding that is elicited by a stimulus can decline if the
stimulus is presented repeatedly. Habituation is a simple form of learn-
ing that occurs with many behaviors and stimuli, including behaviors
elicited or controlled by food.
3. Evolution is a process that allows adaptation between generations;
learning, on the other hand, is a process that allows adaptation within
an animal’s lifetime.
4. Through instrumental conditioning, animals learn to increase their
contact with good outcomes (Os with positive survival value) and
decrease their contact with bad outcomes (Os with negative survival
value). The law of effect describes this state of affairs. Behaviors increase
if they produce good Os (reward learning) or prevent bad Os (avoidance
or escape learning). They decrease if they produce bad Os (punishment)
or prevent good Os (omission). The term reinforcement means strength-
ening; it is used to describe either reward learning (positive reinforce-
ment) or avoidance and escape learning (negative reinforcement).
5. Shaping allows new behaviors to be added to an animal’s repertoire. In
shaping, new behaviors emerge because successive approximations of
the behavior are differentially reinforced. Shaping does not necessarily
require a teacher, and it resembles natural selection in the sense that
behavior is selected by its consequences.
6. In classical conditioning, animals learn to respond to signals for O. The


response is adaptive because it helps the animal optimize its interac-
tion with the upcoming O. Signals for food evoke responses that help
the animal digest the meal and identify good and bad food sources.
Signals for rivals and for mates evoke responses that prepare the animal
for a fight or a sexual encounter. Danger signals evoke a constellation of
physiological and behavioral responses that allow the animal to defend
itself against the impending aversive O. Conditioning with drug Os al-
lows the learning of adaptive responses, like the compensatory condi-
tioned response, that help an animal maintain equilibrium.
7. Through classical conditioning, animals learn to approach signals for
good Os as well as those that predict the absence of bad ones; they
also withdraw from signals that predict the presence of bad Os or the
absence of good Os. This tendency is called sign tracking, and it com-
plements the law of effect in helping animals increase or decrease their
contact with events that have either positive or negative survival value.
8. Extinction occurs in both instrumental and classical conditioning. It is a
decline in responding that occurs when O no longer follows the signal
or behavior that previously predicted it. Extinction allows the animal to
continue to adapt as the environment changes. It is also useful in reduc-
ing unwanted behaviors in the clinic.
9. Classical and instrumental conditioning are both sensitive to the timing
and the magnitude of O. Learning is best when O follows the signal or
the behavior quickly and when O is large or intense.
10. Evolution may prepare animals to associate some events more read-
ily than others. Such “preparedness” is evident in both classical and
instrumental conditioning. This phenomenon was discovered in aversion
learning experiments: Taste is a good signal for illness but a bad one for
an electric shock, whereas audiovisual cues are bad signals for illness
but good cues for shock.

Discussion Questions
1. Discuss methods that ethologists have used to test hypotheses about
the evolution of behavior. Why is it difficult to claim that a behavior is
“innate?”
2. How do we know that the decrease in responding that occurs in habitu-
ation is not merely a result of fatigue? How is habituation functional?
3. Using several examples, discuss how classical conditioning and instru-
mental learning help us adapt to the environment.
4. When psychologists began to study classical conditioning with drug
outcomes, they discovered the conditioned compensatory response.
What is this, and how does it explain the development of tolerance to
drugs? What are the implications for understanding human drug abuse
and addiction?
5. Using what you know about habituation, stimulus learning, and re-
sponse learning, discuss some of the reasons humans might develop a
taste for junk food, seek out restaurants that serve it, and overeat.

Key Terms
acquisition  65
artificial selection  46
avoidance  51
conditioned compensatory response  60
dishabituation  48
drug tolerance  60
dual-process theory of habituation  50
escape  51
ethology  43
exposure therapy  66
extinction  65
fitness  42
fixed action pattern  44
habituation  47
law of effect  52
natural selection  42
negative reinforcement  52
negative sign tracking  64
omission  51
positive reinforcement  52
preparedness  70
punisher  51
punishment  51
reinforcement  52
releaser  44
reward learning  51
sensitization  49
shaping  52
sign stimuli  44
sign tracking  63
spontaneous recovery (after extinction)  66
spontaneous recovery (after habituation)  48
taste aversion learning  55
Chapter Outline
The Basic Conditioning Experiment  80
  Pavlov's experiment  80
  What is learned in conditioning?  81
  Variations on the basic experiment  83
Methods for Studying Classical Conditioning  84
  Eyeblink conditioning in rabbits  85
  Fear conditioning in rats  86
  Autoshaping in pigeons  87
  Appetitive conditioning in rats  89
  Taste aversion learning  90
Things That Affect the Strength of Conditioning  90
  Time  91
  Novelty of the CS and the US  93
  Intensity of the CS and the US  94
  Pseudoconditioning and sensitization  95
Conditioned Inhibition  97
  How to produce conditioned inhibition  97
  How to detect conditioned inhibition  98
  Two methods that do NOT produce true inhibition  100
Information Value in Conditioning  101
  CS-US contingencies in classical conditioning  101
  Blocking and unblocking  103
  Overshadowing  106
  Relative validity in conditioning  106
Summary  109
Discussion Questions  110
Key Terms  111
Chapter 3
The Nuts and Bolts of
Classical Conditioning

Chapter 2 examined why it makes sense that classical


conditioning and instrumental conditioning should
occur and why they are so important. We also had a
chance to look at many different examples of these
two fundamental forms of learning. One theme was
that they have similar functions. This chapter begins
by asking a different kind of question: How do these
types of learning actually work? That is the question
that has interested most of the researchers who have
gone into the lab to study learning.
This chapter takes an initial look at the mecha-
nisms of classical conditioning. As we get into this type
of learning, it is worth remembering that conditioning
experiments are designed to create a well-controlled
situation that is representative of associative learning
in general. By studying learning in a relatively simple
system, we hope to arrive at principles that may de-
scribe associative learning in general. (I will consider
some challenges to this idea in Chapter 6.) We need
to remember that the modern view is that classical
conditioning and instrumental learning reflect simi-
lar learning processes. In both, animals behave as if
they have learned to associate events—either stimuli
(Ss) or responses (Rs)—with biologically significant
outcomes (Os). It makes sense to first consider clas-
sical conditioning because it is often more straight-
forward than instrumental conditioning to study:
The experimenter can present S and O whenever
he or she wants to. (In instrumental conditioning, R
occurs at the whim of the subject and is not entirely under the experi-
menter’s control.) If we can be precise about presenting Ss and Os, we
may hope to arrive at some fairly precise laws that describe the learn-
ing processes represented in both classical and instrumental conditioning.
Although classical conditioning looks a little simple, it is not very
simple-minded. By the end of this chapter, you may begin to realize that
conditioning is probably not what you think it is (Rescorla, 1988b).

The Basic Conditioning Experiment


Pavlov’s experiment
Pavlov’s famous experiment itself simplified some observations made by
Stefan Wolfsohn, an associate of Pavlov who was working in Pavlov’s
laboratory (Boakes, 1984). Wolfsohn was studying how the dog salivated
in response to a number of different things that were put into its mouth.
Perhaps not surprisingly, the dog salivated in response to things like sand
and pebbles as well as food. Wolfsohn noticed that after sand had been put
in the dog’s mouth a few times, the sight of the sand alone was enough to
cause some salivation. The dog had learned about sand; the dog had as-
sociated the visual features of sand with the fact that it causes salivation.
Most objects consist of a number of features (visual cues, tactile sensations,
etc.) that can be associated when we learn about the object. Object learning
is an example of associative learning.
To study the phenomenon more thoroughly, it was useful to design a
procedure in which the initially neutral cue (the sight of the sand) could be
separated from the stimulus (sand in the mouth) that actually caused the
salivation. Separating the neutral and causal stimuli was the beginning of
the conditioning experiment that we know so well today in which a ring-
ing bell and food are paired on each of a number of conditioning trials.
Pavlov developed the procedure, discovered many of its most impor-
tant effects, and introduced the terms that are still used to describe it. These
terms may seem confusing at first, but they are really fairly simple, and
you need to know them if you want to read and think more about classical
conditioning. The idea was to create a neutral description of the experiment
that could be used to describe any experiment on any example of condition-
ing. Pavlov noticed that the biologically significant stimulus, food, had the
power to elicit salivation unconditionally. That is, drooling to food did not
depend on (i.e., was not conditional on) the dog having been through the
experiment. Thus, the food is known as the unconditional stimulus, or US.
(In early translations of Pavlov’s work, unconditioned was used instead of
unconditional.) Because the drooling response to food was not conditional
on the experiment, drooling to the food was similarly known as the un-
conditional response, or UR. These terms are summarized in Figure 3.1.
The sound of a bell comes to elicit salivation as a result of the condi-
tioning experience; its control over the response is “conditional” on the
conditioning experience. The ringing bell is therefore known as the con-

Figure 3.1  The crucial events in Pavlov's famous experiment (left) and the terms we now use to describe them (right): the bell and the food (Bell — Food) become the CS and the US, and the salivation elicited by each becomes the CR and the UR. Pavlov invented the terms, which are still widely used today because scientists need a consistent language to discuss all examples of classical conditioning.

ditional stimulus, or CS. (Once again, early translations of Pavlov used


conditioned stimulus, which is also still used today.) The response to the
sound of a bell itself—also conditional on the conditioning experience—is
known as the conditional response, or CR. Many writers continue to use
the term conditioned instead of conditional. Using that language, the CS is
the “conditioned stimulus,” the CR is the “conditioned response,” and
“conditioned” roughly means “learned.”
Pavlov actually used an interesting variety of cues as conditional stim-
uli: His dogs learned to salivate to things like bubbling noises, whistles,
metronomes, and rotating disks. Early on, Pavlov also established that
conditioning with one CS could often generalize to other similar CSs.
Thus, after conditioning with a tone of a certain pitch, animals will also
respond to other CSs of similar pitch. (Responding to similar cues is called
generalization.) Pavlov actually covered a lot of important and interest-
ing ground in his work, but one of his most lasting contributions is the
vocabulary that is still used to describe conditioning experiments. It is
worth rehearsing the terms US, UR, CS, and CR a little bit before reading
anything in the modern literature on classical conditioning.

What is learned in conditioning?


Explanations of conditioning have usually assumed that the subject learns
an association between two of the events in the experiment. The main con-
troversy has been which two events? When psychologists in the United
States first became interested in conditioning in the early twentieth century,
they assumed that the dog learned to associate the bell with salivation
(the CS with the UR). The bell and drooling to the food occurred together
in time, and because the reflex is all about connections between stimulus
and response, it was natural to assume that there was an S-R association,
or S-R learning. Pavlov himself, however, had actually taken a different
view. He assumed that the dog came to associate the bell and food and that
the bell eventually elicited drooling because it had been associated with
the food (Figure 3.2). For Pavlov, the ringing bell became a substitute for
the food in controlling the drooling reflex; his idea is known today as the
stimulus substitution theory. The crucial association was between two
stimuli: S-S learning.
Research that has followed suggests that the association is most often
S-S, and although both S-S and S-R learning are possible, most experi-
ments suggest that the animal has learned to associate the CS with the US
Figure 3.2  Two associations can be learned in a classical conditioning experiment. At left, the organism might associate a stimulus (the CS, the bell) and a response (the drooling UR), known as S-R learning. At right, the organism might associate a stimulus (the CS) with another stimulus (the US, the food), known as S-S learning.

(e.g., see Rescorla, 1978). For example, in one experiment with rats, Robert
Rescorla (1973a) conditioned a fear of a light by pairing the light with the
sound of a klaxon, a very loud stimulus that arouses fear. After fear con-
ditioning was complete, one group of rats received repeated exposure to
the klaxon alone in a second phase (Figure 3.3). The exposure habituated
the rats’ fear of the klaxon; at the end of this phase, the klaxon no longer
frightened the rats. A control group did not receive habituation. At the end
of the experiment, both groups were tested for their fear of the light. The
rats for which the klaxon had been habituated were less afraid of the light
than the rats in the control group.
Why should habituation of the klaxon cause fear of the light to change?
The klaxon was never paired with the light again after fear of it had been

(A)
Group   Phase 1      Phase 2               Test
1       L — Klaxon   Klaxon, klaxon, ...   L?
2       L — Klaxon   —                     L?

Figure 3.3  (A) Design of Rescorla's (1973a) experiment suggesting that rats associate CS and US in Pavlovian conditioning experiments. In the experiment, a light (L) was paired with a loud klaxon in two groups of rats. This pairing caused the conditioning of fear to the light. In the next phase, the klaxon was presented over and over again to Group 1 so that fear of it became habituated. When the light was tested in both groups, it also elicited less fear in Group 1. (B) Results of testing: the suppression ratio for each group. The measure of conditioning is the suppression ratio; less fear is indicated by a higher ratio. (After Rescorla, 1973a.)

habituated. Rescorla argued that learning in the first phase must have been
S-S: The rats first learned to associate the light with the klaxon. As a result,
the light aroused fear because it activated a representation of the klaxon in
memory. The habituation phase then taught the rats that the klaxon was not
all that bad after all. During testing, the light activated the modified klaxon
representation, and the rats experienced less fear than before. Similar re-
sults have been produced in several other situations (e.g., Holland, 1990a,
2005; Holland & Rescorla, 1975a; Holland & Straub, 1979; Rescorla, 1974).
Other studies of classical conditioning suggest that the rat’s representation
of the US can include many of its features, including its emotional value,
as above, but also its sensory properties and temporal properties, that is,
when it will occur in time (e.g., Delamater, 2011, 2012; Delamater & Oake-
shott, 2007; Wagner & Brandon, 1989; see Chapters 4 and 9). According to
modern thinking about classical conditioning, conditioning does produce
an association between the CS and the US, although S-R learning does
sometimes occur (e.g., Donahoe & Vegas, 2004; Rizley & Rescorla, 1972;
see Holland, 2008, for a good review).

Variations on the basic experiment


There are two variations on the simple conditioning experiment that fur-
ther allow a CS to elicit a response even though it has never been directly
associated with a US. Once a CS has been fairly well conditioned, the CS
itself can serve as a “US” and support new conditioning. The effect, known
as second-order (or higher-order) conditioning, is illustrated in Figure
3.4. In this example, a light CS is first paired with a food US until the light
elicits a conditioned response quite reliably. Then, in the second phase, a
new CS (a tone, T) is now paired with the light. As a result of these trials,
the tone now elicits a CR, even though it has never been paired directly with
a US. This phenomenon has been shown in several conditioning arrange-
ments (e.g., Rescorla, 1980). It is also probably common outside the lab. For

Second-order conditioning
Phase 1 Phase 2 Test

L — Food T—L T?

Sensory preconditioning
Phase 1 Phase 2 Test

T—L L — Food T?

Figure 3.4  Second-order conditioning and sensory preconditioning. Both are


important because they are ways that organisms can come to respond to a CS
(in this case, a tone, T) that has never been directly paired with the US (food).

example, a child who is bitten by a dog may associate the dog (CS) with
the bite (US) and become afraid of dogs. If the child later encounters dogs
in a local park, he may become afraid when he enters the park—through
its association with dogs.
Figure 3.4 also illustrates a related phenomenon that is easy to confuse
with second-order conditioning. In sensory preconditioning, two neutral
stimuli are first paired and then one is separately associated with a US. In
the figure, a tone and a light are paired in an initial phase. Then the light is
separately associated with food. In a third test phase, the experimenter tests
the subject’s response to the tone. In this arrangement, the tone will elicit a
CR. Like second-order conditioning, sensory preconditioning suggests that
conditioning can occur even when the CS is not paired directly with a US.
In fact, one of the interesting things about sensory preconditioning is that
the target CS (the tone) is also never paired with the response it eventually
evokes. In second-order conditioning, the tone could be associated with a
response elicited by the light, but in sensory preconditioning, the same two
stimuli are paired before the light ever has a reason to elicit the CR. Thus,
sensory preconditioning is often thought to be a case of pure S-S learning
(but see Holland, 2008). It is also probably common in human experience.
For example, you might associate Randy and Trevor after you see them
hanging out together around campus. When you later learn that Trevor
has been arrested for breaking and entering, you might think that Randy
could be guilty of the same thing. This case of “guilt by association” is an
example of sensory preconditioning.
These phenomena provide ways that stimuli can control conditioned
responding without ever being paired directly with a US. Generalization,
second-order conditioning, and sensory preconditioning can expand the
range of stimuli that can affect behavior even after a fairly specific condi-
tioning experience. They are worth keeping in mind when you consider
how behaviors, like some fear responses that might be seen in a clinic, can
arise through simple conditioning (for related discussions, see Mineka,
1985, and Davey, 1992).

Methods for Studying Classical Conditioning


It is useful to think broadly about conditioning so that you get into the
practice of applying laboratory findings to behavior in the real world.
But when scientists investigate the specific details of conditioning, it is
important to have a standard set of methods that can be used by differ-
ent investigators working in different laboratories. That way, researchers
know that they are studying the same problem, which makes it easier to
arrive at an agreement on important details. Classical conditioning re-
search has focused on several standard examples of conditioning that are
often referred to as conditioning preparations. The range of responses
studied reinforces the point that conditioning controls many different
types of behavior in many different species.

Figure 3.5  Eyeblink conditioning in rabbits. (A) The experimental setup for measuring nictitating membrane conditioning. (B) Typical acquisition curve: the percentage of conditioned responses plotted over 20-trial blocks. (A, after Gormezano et al., 1983; B, after Weidemann and Kehoe, 2005.)

Eyeblink conditioning in rabbits


One important conditioning preparation involves the eyeblink reflex in
rabbits. In this method, rabbits are initially habituated to mild restraint
in a stock that keeps them in one place (Figure 3.5A). The stock can then
be put in a soundproof enclosure that reduces extraneous noise and light.
The rabbit is then exposed to brief (typically about one-half second) tones
and light CSs that are paired with USs that consist of either a puff of air to
the cornea of the eye or a mild electric shock (usually about one-tenth of a
second) delivered near the eye, which cause the rabbit to blink. And, after
a number of trials, the rabbit begins to blink in response to the CS (Figure
3.5B). The response that many investigators actually measure is closure
of the nictitating membrane, a third inner eyelid that rabbits have that
sweeps across the eyeball when the rabbit blinks. The eyeblink condition-
ing method was developed and promoted by Isadore Gormezano and his
associates, who did an impressive amount of work uncovering the variables
that affect the conditioning (e.g., Gormezano, Kehoe, & Marshall, 1983).
The eyeblink CR is easy to measure and observe. Conditioning proceeds
so that, with more and more pairings of the CS and US, the CS comes
to elicit the eyeblink just before the US is about to occur. An interval of
about four-tenths of a second between the onset of the CS and the onset of
the US is about optimal for getting good eyeblink responding to the CS.
With typical procedures, the conditioned response itself begins to happen
regularly after a few hundred CS-US pairings. One attractive feature of the
eyeblink method is that the conditioned response is simpler than the types
of responses that are measured in other conditioning preparations (see
below). This simplicity, plus the fact that so much systematic behavioral
research has been done with the eyeblink method, has made it possible

for significant headway to be achieved in understanding the neural basis


of conditioning and learning (e.g., Christian & Thompson, 2003; Freeman
& A. B. Steinmetz, 2011; J. E. Steinmetz, 1996; Thompson & Krupa, 1994;
Thompson & J. E. Steinmetz, 2009).

Fear conditioning in rats


Classical conditioning has been thought to control emotional responses
ever since John B. Watson and the early days of behaviorism. Recall from
Chapter 1 that Watson and Rayner (1920) showed that Albert, an 11-month-
old infant in a day-care setting, learned to be afraid of a white rat when the
rat was paired with the banging of a steel rail. Watson and Rayner showed
that Albert acquired a fear of the rat that generalized to other white, furry
objects. Conditioning is still widely assumed to be a basis for the learning
of fears and phobias.
Fear conditioning, which is sometimes also called “threat condition-
ing” (e.g., LeDoux, 2012, 2015), is now typically conducted with rats as
the subjects rather than as the conditional stimuli. A time-honored method
is known as conditioned suppression or the conditioned emotional
response (CER) technique. The rat is first trained to press a bar in a stan-
dard Skinner box for a food reward (Figure 3.6A). Then, after the rat is
pressing the bar at a regular rate, the experimenter presents a light, tone,
or noise CS, and this stimulus is paired with a brief (0.5 second) and mild
electric shock delivered through the floor of the box. (The typical shock
feels like a tingling sensation to a human’s hand.) The bar-press response
has nothing to do with the presentation of the CS or US. After the CS has
been paired with shock several times, however, the rat will stop pressing
when the CS is turned on. The extent to which bar pressing is suppressed

Figure 3.6  Conditioned suppression in rats. (A) The experimental setup. (B) Typical acquisition curve: the suppression ratio plotted over 2-trial blocks (see text for further explanation). (After Hall, Prados, and Sansa, 2005.)

by presentation of the CS is a measure of conditioned fear. Often, the rat


stops pressing the bar because the CS elicits freezing (e.g., Bouton & Bolles,
1980), which has become an important measure of conditioning in its own
right (e.g., Holland, 1979; Kim & Fanselow, 1992; Leung & Westbrook, 2010;
Rudy & O’Reilly, 1999).
In the conditioned suppression situation, researchers use a standard
technique to express the extent to which the CS suppresses the bar-press
response. First, they count the number of bar presses made during the CS
and during an equal time period just before the CS (often called the “pre-CS
period”). Then they calculate a suppression ratio by taking the CS count
and dividing it by the sum of responses made during the pre-CS period
and the CS. The ratio has a value of 0.5 when the CS does not change the
bar-press rate, but it goes down to zero as the CS becomes more and more
effective at suppressing the bar-press rate (Figure 3.6B).
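To make the arithmetic concrete, here is a minimal sketch in Python (illustrative only; the function name and the response counts are hypothetical) of the calculation just described:

    def suppression_ratio(cs_responses, pre_cs_responses):
        """Suppression ratio = CS responses / (pre-CS responses + CS responses).

        A value of 0.5 means the CS did not change the bar-press rate; values
        near 0 mean the CS strongly suppressed responding (strong conditioned fear).
        """
        total = cs_responses + pre_cs_responses
        if total == 0:
            raise ValueError("no responses in either period; the ratio is undefined")
        return cs_responses / total

    # Example: 5 bar presses during the CS and 20 during the equal pre-CS period.
    print(suppression_ratio(5, 20))   # 0.2  -> substantial suppression (fear)
    print(suppression_ratio(20, 20))  # 0.5  -> no suppression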
In this method, the CS durations are relatively lengthy, ranging from
30 seconds to 3 minutes. In typical procedures, conditioning appears to
be quite strong after only four to eight conditioning trials (see Figure 3.6);
it is fairly easy to show significant learning after only one CS-US pairing.
Conditioning can also occur when the CS and US are separated by many
seconds (e.g., Kamin, 1965). The picture is somewhat different from the one
presented in eyeblink conditioning. It is nonetheless interesting to observe
that the basic variables that affect the strength of conditioning are similar
in the two methods.
Conditioned suppression has been an important method for several
reasons. Fear conditioning continues to have implications for cognitive-
behavioral treatments of fear and anxiety in clinical psychology (e.g., Bou-
ton, 2002; Bouton et al., 2001; Craske et al., 2014; LeDoux, 2015; Mineka,
1985, 1992). In addition, as we will see in Chapter 9, conditioned emotions
are worth studying because they play a role in motivating instrumental
behavior. Finally, a number of conditioning’s most important phenomena
were first discovered and investigated with conditioned suppression.
This method was first used by W. K. Estes and B. F. Skinner (1941), but
in the 1960s it became an important method for studying conditioning
thanks to the work of Leon Kamin (e.g., 1965; 1969) and Robert Rescorla
(e.g., 1968b).

Autoshaping in pigeons
In Chapter 2, I noted that animals approach signals for good unconditional
stimuli—a conditioned behavior known as “sign tracking.” The first sys-
tematically investigated example of sign tracking is known as autoshap-
ing. Brown and Jenkins (1968) were faced with the task of getting a group
of pigeons to learn to peck a plastic disk on the wall to get food (Figure
3.7). (Pecking at the disk is recorded by a device that is a lot like a tele-
graph key; hence, the disk is usually known as a “key.”) Instead of shap-
ing the response by reinforcing successive approximations, Brown and
Jenkins merely illuminated the key for 8 seconds and then gave food to

Figure 3.7  Autoshaping in pigeons. (A) The experimental setup. (B) Typical acquisition curve: peck rate (pecks per second) plotted over 20-trial blocks. (After Balsam and Payne, 1979.)

the pigeons. After about 45 such pairings of the “keylight” and food, the
birds began pecking the key when it was lit. The pecking response was
shaped automatically, and ever since, pecking behavior created this way
by experimenters has been known as autoshaping. Today, pigeons can be
shown complex visual stimuli generated by computer and presented on
monitors attached to the wall (e.g., George & Pearce, 1999, 2003; see also
Chapter 8). Autoshaped pecks at the stimuli on the monitor can be detected,
for example, by photocells—touchscreens are also often used.
The interesting thing about the autoshaped pecking response is that
the bird does not have to peck the key to get the food. The arrangement
is Pavlovian; the food is presented whether or not the bird pecks the key.
It is almost as if the bird cannot resist pecking the food signal. This sort
of conclusion is suggested almost humorously by several results. For ex-
ample, the key can be placed at the end of a long box with food that is made
available for a few seconds from a hopper positioned some distance away.
If illumination of the key is paired with food, the bird will approach and
peck the key even though doing so prevents getting back to the hopper in
time to get the food (Hearst & Jenkins, 1974). We will consider similar data
in Chapter 10. Since the 1970s, autoshaping in pigeons has been used as
a method for investigating how animals associate signals with food (e.g.,
Locurto, Terrace, & Gibbon, 1981).
With typical procedures, autoshaping develops after about 40 CS-US
pairings. Pigeons will associate the keylight with food when the food is
separated from the keylight by as much as 6 to 12 seconds (Balsam, 1984;
Kaplan, 1984; Lucas et al., 1981). In terms of the number of trials to
acquisition as well as the delay permitted between the CS and the US,
autoshaping falls somewhere between eyeblink and fear conditioning.

Figure 3.8  Conditioned food-cup entry in rats. (A) The experimental setup.
(B) Typical acquisition curve: mean number of responses during the CS and
during the pre-CS period, across 20-trial blocks. (After Harris, 2011.)

Appetitive conditioning in rats


Rats, of course, will likewise sign track and approach signals for food (e.g.,
Karpicke et al., 1977; Robinson & Flagel, 2009; Saunders & Robinson, 2013).
Another measure of conditioning, however, takes advantage of the fact
that rats also approach the food cup to which a food pellet is about to be
delivered. This behavior is often called goal tracking (e.g., Boakes, 1977).
In a typical laboratory setup, the experimenter delivers a food pellet to a
food cup positioned behind the wall of the conditioning chamber (Figure
3.8A). The rat retrieves the pellet by sticking its snout in the food cup
through a hole in the wall. Each “food-cup entry” is detected by a photocell
beam. Over a series of trials in which a CS is paired with delivery of a food
pellet US, the rat will enter the food cup more and more frequently during
the CS (Figure 3.8B). Responding precedes (anticipates) delivery of the
US. The method is called the conditioned food-cup entry or magazine
approach procedure.
The duration of the CS used in this kind of experiment is typically 10
or 30 seconds. Conditioned responding develops within 30 trials or so.
Although most responding occurs during the CS, the rat also responds a
bit during the intervals between the CSs. This tendency is shown in Figure
3.8B as responding in the period just before each CS presentation (“pre
CS”). Experimenters therefore need to define conditioned responding as
the extent to which presenting the CS increases the frequency of respond-
ing over this baseline. You might wonder whether food-cup entries are
actually operant responses that are reinforced by being associated with
food instead of being Pavlovian responses to the CS. Evidence suggests,
however, that they are controlled by the Pavlovian CS-US relationship more
than the operant response-reinforcer relationship (e.g., Harris, Andrew,
& Kwok, 2013; Holland, 1979a). Since the 1980s and 1990s, the food-cup
entry procedure has become a widely used method for studying Pavlovian
conditioning (e.g., Delamater & Oakeshott, 2007; Harris, 2011; Haselgrove
et al., 2010; Rescorla, 2006a).

Taste aversion learning


As described in Chapter 2, rats will come to reject a flavor that is associ-
ated with the injection of a drug that makes them nauseated. Here, the
flavor stimulus (e.g., saccharin) is a CS, whereas the drug injection is a US.
Taste aversion learning is one of the more rapid and robust examples of
classical conditioning. Under the right conditions, a strong aversion can
be learned after only one conditioning trial. Also, as discussed in Chapter
2, aversions can be learned when the CS and US (flavor and injection) are
separated by up to several hours (see Figure 2.16). These features of taste
aversion learning (among others) led many investigators to conclude that
taste aversions are a unique and highly specialized form of learning (e.g.,
Rozin & Kalat, 1971; Seligman, 1970). There is no question that some things
appear to make taste aversion learning special; we will consider the ques-
tion more carefully in Chapter 6. Many researchers, however, have argued
that including taste aversions in the list of Pavlovian learning systems is
not only reasonable but has invigorated—and provided important insights
into—the general principles of classical conditioning (e.g., Domjan, 1983).
One attraction of taste aversion learning is that it appears to be an
example of how animals learn about foods. By understanding taste aver-
sions, one can begin to understand how animals come to select and avoid
certain foods. Taste aversions also have important applications outside the
laboratory. For example, some of the drugs used to treat cancer are toxic and
cause severe nausea. Cancer patients undergoing treatment can lose their
appetite; indeed, loss of appetite and malnourishment may be a significant
factor in death due to cancer (Morrison, 1976). Ilene Bernstein showed
that children entering a Seattle clinic to get chemotherapy would learn an
aversion to a novel ice cream that they ate a few minutes before treatment
(Bernstein, 1978; see also Bernstein & Webster, 1980). The aversion required
an actual pairing of the ice cream with the chemotherapy treatment and
was specific to the ice cream that was eaten before the treatment. Taste
aversion learning—or something like it—appears to play a significant role
in chemotherapy situations (see Burish, Levy, & Meyerowitz, 1985; Scalera
& Bavieri, 2009).

Things That Affect the Strength of Conditioning


A number of factors affect whether or not conditioning occurs when CSs
and USs are presented, or more usually, how good the conditioning is that
results after this type of presentation. Let us review several of the basic
variables that affect how well classical conditioning is learned. Some of
these variables were introduced in Chapter 2.

Figure 3.9  Different ways to present CS and US in time: (A) delay conditioning,
(B) trace conditioning (CS and US separated by a trace interval), (C) simultaneous
conditioning, and (D) backward conditioning.

Time
Time is a fundamentally important factor in classical conditioning. This
makes sense from the functional perspective developed in Chapter 2: Other
things being equal, animals should be sensitive to the closeness with which
CS and US occur in time. However, this is only one of the ways in which
classical conditioning is sensitive to time.
As a rule of thumb, conditioning works best when the CS occurs before
the US; the CS must signal that a US is about to happen. (I use “rule of
thumb” to describe a decent, but not totally infallible, rule to go by.) Figure
3.9 describes a number of ways in which CS and US can be presented in
time. In delay conditioning, the CS comes on and then ends with pre-
sentation of the US (see Figure 3.9A). This is an excellent way to produce
conditioning, although the amount of conditioning will decrease if the
interval of CS onset to US onset exceeds some value. This interval depends
on the conditioning preparation; in eyeblink conditioning, for example,
the optimal interval between CS onset and US onset is 0.4 second, with
little conditioning occurring at all when the interval exceeds 2 or 3 seconds
(Gormezano, Kehoe, & Marshall, 1983; Smith, 1968). In conditioned sup-
pression, an interval of 180 seconds is quite effective (e.g., Kamin, 1965).
Another arrangement is trace conditioning, in which the CS and US are
separated by a gap (the “trace interval”) (see Figure 3.9B). Trace conditioning
gets its name from the idea that some neural “trace” of the CS, rather than
the CS itself, is paired with the US. Trace procedures can produce good learn-
ing, but conditioning gets worse as the trace interval increases (e.g., Balsam,
1984; Smith, Coleman, & Gormezano, 1969; see Chapter 2). The most obvious
reason conditioning decreases with longer trace intervals is that the animal
may begin to forget the CS over longer and longer gaps in time (cf. Wagner,
1981). There are other reasons, too. For instance, with longer trace intervals,
the animal might not discriminate the trace interval (which soon ends in the
US) from the interval of time that occurs between trials (which does not end
in a US). To test this possibility, a second stimulus can be presented during
either the trace interval or the interval between trials (the so-called intertrial
interval). Either stimulus will increase the conditioning that develops to the
CS (Bolles et al., 1978; Kaplan & Hearst, 1982).
A third way to arrange the CS and US in time is to present them both
simultaneously (see Figure 3.9C). This is called simultaneous condition-
ing. Because the CS does not really signal that the US is about to occur in
this arrangement (by definition, the US is already happening), simultane-
ous conditioning is often believed to yield weak conditioning. The ar-
rangement does cause weaker conditioning than a delay procedure with
a short CS-US interval (Heth, 1976), but John J. B. Ayres and his associates
have shown that the simultaneous procedure can often produce surpris-
ingly good conditioning (e.g., Burkhardt & Ayres, 1978; Mahoney & Ayres,
1976). Most responses that are used to measure conditioning are performed
because they help the animal deal with an upcoming US, and this may lead
us to underestimate the amount of learning that really does occur with a
simultaneous procedure (Matzel, Held, & Miller, 1988). As usual, it is use-
ful to distinguish what a subject does (i.e., its performance) from what it
actually knows (i.e., its learning).
The final arrangement shown in Figure 3.9D is called backward con-
ditioning because the CS follows, rather than precedes, the US in time.
The backward procedure usually does not produce as much conditioned
responding as forward pairings of the CS and US, although you often do get
some responding (e.g., Ayres, Haddad, & Albert, 1987). The rule of thumb
I gave above seems to handle this result nicely: If the CS signals anything,
it signals the interval before the next trial—that is, a period with no US
(Moscovitch & LoLordo, 1968). In fact, the subject often treats a backward
CS as a signal for “no US” (a conditioned inhibitor, see below), although
the reason it does so is a matter of controversy. One possibility is that it
signals a period of no US. Another is that it is associated most directly with
a reaction to the offset of the US (e.g., Solomon & Corbit, 1974; Wagner,
1981; see Maier, Rapaport, & Wheatley, 1976). In fear conditioning, for
example, the onset of a shock US can arouse fear, whereas its offset may
elicit something like relief. A backward CS may appear to signal “no US”
because it is associated with relief (see Chapter 4).
Figure 3.10  Trial spacing in Pavlovian conditioning. Bullets (•) indicate
presentation of a US. Conditioning is better when trials are spaced rather than
massed in time (line 2 versus line 1). Time is relative, however. If the duration
of the CS and the time between the trials are both multiplied by the same
factor, there is no benefit to trial spacing; that is, conditioning would be about
the same in lines 1 and 3.

Another well-known effect of time in conditioning is illustrated in Figure
3.10. Conditioning is better if the conditioning trials are spread out
in time (spaced trials) than if they occur close together in time (massed
trials). There are several reasons spaced trials may cause better condition-
ing than massed trials. One idea is that learning requires that the subject
“rehearse” the CS and US together in memory for a while after each con-
ditioning trial (e.g., Wagner, Rudy, & Whitlow, 1973). A new trial could
interrupt rehearsal of the last one if it occurs too soon (see Chapter 4 for
other, related possibilities). Another interesting finding, though, is that if
the time between trials and the time in the CS are both increased by the
same factor, there may be no benefit to spacing the trials. For example, if
they are both tripled, as in the lowest line in Figure 3.10, the subject can
perform poorly, as if the trials are massed (e.g., Gibbon et al., 1977; see also
Holland, 2000, and Lattal, 1999). John Gibbon, Peter Balsam, and Randy
Gallistel have suggested that the success of conditioning depends on the
ratio between the time between trials and time in the CS; the ratio is bigger
when time between trials is increased (see Figure 3.10, line 1 versus line
2), but not if the time in the CS and the time between trials are both multi-
plied by the same factor (see Figure 3.10, line 1 versus line 3; e.g., Gibbon
& Balsam, 1981; Gallistel & Gibbon, 2000).

Novelty of the CS and the US


Conditioning occurs most rapidly if the CS and US are new to the subject
when conditioning first begins. Exposure to either stimulus before it is
paired during conditioning can interfere with learning.
Exposure to the CS before conditioning can reduce how quickly ani-
mals learn about it. The finding that “preexposure” to the CS can interfere
with conditioning is called latent inhibition (e.g., Lubow, 1973). The more
exposures to the CS, the more interference one observes with condition-
ing (e.g., Siegel, 1969). The simplest explanation begins by assuming that
the subject must pay attention to the CS for conditioning to occur. If you
were exposed to a ringing bell repeatedly at the start of an experiment,
you might initially pay attention to it, but your initial attention might de-
crease—habituate—with more and more exposures. During preexposure,
the subject may come to pay less and less attention to the CS; preexposure
might habituate an attentional response to the CS that is necessary for good
conditioning to occur.
A related effect happens with preexposure to the US. Repeated exposure
to the US alone before conditioning can reduce its effectiveness as a US
(e.g., Randich & LoLordo, 1979). The effect is called the US preexposure
effect: Exposure to the US before conditioning has occurred can retard
subsequent conditioning. Once again, the more preexposure, the worse the
conditioning is later. Habituation may be involved again. By presenting
the US repeatedly before conditioning, you may habituate some effect of
the US that is necessary for good conditioning.
Latent inhibition and the US preexposure effect have significant impli-
cations for learning in the real world. Consider the conditioning of fears
and phobias. Because of latent inhibition, a person is less likely to learn to
associate familiar stimuli with a traumatic event; instead, more condition-
ing to novel stimuli will occur. Similarly, a less novel (or more familiar)
traumatic US might be less effective at causing fear conditioning. In ad-
dition to their practical value in understanding real-world conditioning,
both latent inhibition and the US preexposure effect have been important
in shaping theories of conditioning, as we will see in Chapter 4.

Intensity of the CS and the US


I mentioned in Chapter 2 that conditioning is better—or the strength of
the conditioned response is stronger—the stronger or more “intense”
the US. For example, stronger electric footshocks yield stronger fear con-
ditioning (Annau & Kamin, 1961; Morris & Bouton, 2006). Intensity of
the US—or the US’s magnitude—roughly determines the upper limit of
learning.
The intensity of the CS is also important. Thus, a quiet bell, noise, or
tone is less effective than a louder one. Conditioning theorists often speak
of the “salience” of the CS: The louder or brighter the CS, the more salient
it is. Roughly speaking, salient CSs are especially attention-grabbing. That
may explain why they are more effective in conditioning.
There are limitations to the effect of CS intensity, though. Imagine try-
ing to condition a dog to salivate to the sound of a very loud fire bell.
The ringing fire bell is a more intense stimulus than the quietly ringing
bell that Pavlov usually used, but it is doubtful that the dog would salivate
more to it: the sound would also be startling and frightening, and the fear
it elicited would likely make the bell quite poor at eliciting a drooling
reflex. More intense CSs are
not always better than weaker CSs in conditioning experiments because
strong stimuli often elicit responses of their own. Therefore, salient—but
not overpowering—stimuli are the most effective as CSs in conditioning
experiments.

Pseudoconditioning and sensitization


The fact that CSs can elicit responses of their own can actually complicate
how one interprets the results of conditioning experiments. The problem
is especially important if the CS naturally elicits a response that looks like
the response that is supposed to be conditioned. For example, if present-
ing a light caused the subject to blink its eye without any conditioning, it
would obviously be difficult to separate blinking that might result from
conditioning from this natural blinking to the light.
The problem is actually more subtle than that. Suppose we run a subject
who receives a number of pairings of a light CS and an air puff US in a
simple eyeblink conditioning experiment. If blinking to the CS started at a
very low level and then increased on each trial, we might feel safe conclud-
ing that the subject was learning to associate the CS and the US. Unfortu-
nately, there are two processes besides true conditioning that might make
the subject respond more and more to the CS. A careful experimenter al-
ways needs to separate these counterfeit processes from true conditioning.
One counterfeit process is sensitization, the phenomenon we saw in
Chapter 2. Recall that a stimulus that is simply presented by itself on sev-
eral trials can sometimes evoke a stronger and stronger response over trials.
(Sensitization is the opposite of its better-known cousin, habituation.) It is
possible that our light CS starts with a weak natural tendency to elicit an
eyeblink response. If sensitization were to occur, we would see an increase
in blinking to the light over trials that would not depend on learning to as-
sociate the CS and the US. Although the response increase could be due to
repeated exposure to the CS on its own, exposure to the US can sensitize a
response elicited by the CS as well (e.g., Mackintosh, 1974). Remember that
the sensitization process initiated by exposure to a stimulus is thought to
create a general state of arousal or excitability (Groves & Thompson, 1970).
A second counterfeit process that can look like true conditioning is
known as pseudoconditioning. Pseudoconditioning is the development
of a response to the CS that might occur because of mere exposure to the
US. It differs from sensitization in that the counterfeit response to the CS
is entirely new; sensitization, in contrast, is the augmentation of an exist-
ing response that is already elicited at least a little by the CS. For example,
in our hypothetical eyeblink experiment, the flash of light (the CS) might
never elicit an eyeblink on its own in our subject, but a few exposures to
the airpuff might be sufficient to make the subject blink to the light simply
because they are both sudden stimuli; the blink elicited by the airpuff might
generalize to the sudden flash of light. This is not true conditioning; it does
not depend on associating the CS and the US (or the CS and the UR).
Experimenters can use control groups to help reduce their concerns
about sensitization and pseudoconditioning. Because pseudoconditioning
and sensitization can occur due to exposure to the US alone (the airpuff,
in our example), a group could be given the same exposures to the US
alone and then tested for a response to the CS. If the conditioning group
responded more to the CS, we could be a little more confident that the
response resulted from true conditioning. Similarly, because sensitization
can be due to exposure to the CS alone, a group could be given the same
exposures to the CS. If the conditioning group again responded more to
the CS, the difference could not be due to sensitization to the CS.
Pseudoconditioning and sensitization are possible in any conditioning
experiment. Ilene Bernstein recognized this possibility in the taste aversion
learning experiment with children receiving chemotherapy that I mentioned
earlier (Bernstein, 1978). Bernstein noticed that the drugs used to treat cancer
often make people very nauseated; she also knew that cancer patients often
lose their appetite for food. Could chemotherapy be conditioning taste aver-
sions to the patients’ food? To find out, Bernstein ran an experiment on taste
aversion learning in children who came to a clinic to receive chemotherapy.
The experimental group received a novel ice cream (Mapletoff, a combina-
tion of maple and marshmallow flavoring) before receiving a drug that was
known to make them nauseated. When they next returned to the clinic, they
were given a choice between eating another dish of Mapletoff or playing a
game. The children who had received Mapletoff before their drug treatment
rejected Mapletoff and chose instead to play the game (Figure 3.11).
Had the children learned an aversion to Mapletoff? One possibility is
that they rejected it merely because being made ill could have decreased
their interest in any ice cream. To check, Bernstein included a group that
received the same kind of chemotherapy—but no Mapletoff—on the ex-
perimental visit. (They played with a toy instead.) These children did
not reject Mapletoff during the subsequent test; in fact, a majority chose
it over the game. The rejection shown by the experimental group was
therefore not due to pseudoconditioning or sensitization created by the
US. Was the rejection due to sensitization due to mere CS exposure? In
this case, sensitization would be a loss of preference for Mapletoff after
a simple exposure to it. To check, Bernstein included a second control
group that received the Mapletoff on the “conditioning” day but did
not receive a drug that made them sick. These children did not reject
Mapletoff ice cream on the subsequent test either; the rejection by the
experimental subjects was not due to sensitization. By testing control
groups for both pseudoconditioning and sensitization, Bernstein was able
to conclude that the rejection of Mapletoff in her experimental subjects
was due to the explicit combination of Mapletoff and chemotherapy; it
was true aversion conditioning.

Figure 3.11  Design and results of Bernstein’s (1978) experiment on taste
aversion learning in children receiving chemotherapy.

Events paired on conditioning day     Percentage who chose Mapletoff on test day
Mapletoff — Treatment                 21%
Toy — Treatment                       67%
Mapletoff — No treatment              73%

It is possible to control for both pseudoconditioning and sensitization in
a single control group that receives as many CSs and USs as the experimen-
tal group but in a way that does not allow conditioning. Some possibilities
are to present the CS and US separated by an amount of time that does not
support conditioning, or in a backward manner, or randomly in time. It
would be a mistake, though, to assume that subjects do not learn anything
with these procedures (see the next sections). The methods nevertheless still
control for pseudoconditioning and sensitization if they give the subjects
equal exposure to the CS and the US.

Conditioned Inhibition
We have been talking about the kind of conditioning that occurs when
CSs are associated with USs. Another type of conditioning, however, oc-
curs when CSs are associated with the absence of USs. The earliest work
on this kind of learning was once again done by Pavlov. Because he was
a physiologist (rather than a psychologist), Pavlov saw this second type
of conditioning as an example of a process that is known to exist in the
nervous system: inhibition. Pavlov’s neuroscientific vocabulary is retained
today. Thus, when a CS is associated with a US, we speak of conditioned
excitation, but when a CS is associated with the absence of a US, we speak
of conditioned inhibition. CSs with conditioned excitation and inhibition
are known as conditioned excitors and inhibitors, respectively.
Conditioned inhibition is as fundamental to modern research as con-
ditioned excitation. Excitation and inhibition are thought to be opposites.
For example, in fear conditioning, an excitor (a CS associated with shock)
excites a fear or anxiety state when it is presented. Conversely, an inhibi-
tor (a CS associated with no shock) inhibits fear, signals safety, or causes
“relief.” In situations in which the US is food, an excitor (a CS paired with
food) elicits a state of appetitive excitement. In contrast, an inhibitor (associ-
ated with no food) might inhibit that state and cause frustration. Excitation
and inhibition are both motivationally significant (see Chapter 9).
Pavlov first encountered inhibition when he studied extinction. You will
remember that extinction happens when a CS that has been paired with a
US is subsequently presented repeatedly without the US. The conditioned
response is gradually lost, but Pavlov knew that it was not “unlearned”; he
and his students had observed spontaneous recovery and other phenom-
ena. Thus, to Pavlov, the conditioned reflex must have been inhibited in
extinction. Inhibition developed in extinction and opposed or subtracted
from the original excitatory reflex. For some reason, it was more “labile”
than excitation so that it was lost when time passed or when some distrac-
tion occurred.

How to produce conditioned inhibition


Pure conditioned inhibition can be produced in several different ways (see
LoLordo & Fairless, 1985). As a rule of thumb (remember, such rules are not
infallible!), it develops when a CS occurs and signals no US. One method
is known as differential inhibition or discriminative inhibition. One CS
(call it A) is repeatedly paired with a US, whereas another CS (call it X) is
repeatedly presented without a US on other trials. Not surprisingly, the
animal will come to respond to A but not to X. The subject discriminates
between the two stimuli. Often, X becomes a conditioned inhibitor. In fact,
excitation and inhibition are thought to be conditioned in most basic dis-
crimination procedures.
A second method is the conditioned inhibition procedure. (Like many
effects in conditioning, the term describes both a result and a procedure.) In
this case, one CS (A) is paired with a US, and on other trials it is presented
together with a second stimulus (X). (A and X presented together make up
a compound conditional stimulus.) The compound AX stimulus is then
presented without a US. Not surprisingly, subjects will respond when A is
presented by itself but will learn to not respond to AX. Casually speaking,
X signals a trial when A will not be followed by a US. Interestingly enough,
if X is removed and tested by itself, the subject treats it as if it signals no
US; it has the properties of a conditioned inhibitor (described in the next
section). The conditioned inhibition procedure is thought to be one of the
most fundamental and effective ways to condition inhibition.
Other methods involve presenting a CS (X) alone and then presenting
a US far away from it in time. The CS and US are never paired; in fact,
they are “unpaired,” and the method is sometimes called the explicitly
unpaired procedure. Because the CS and US are negatively “correlated”
in time, the procedure is sometimes described as negative correlation. In
this case, X sometimes acquires inhibition, too.
In some cases, inhibition can actually develop even when a CS always
ends in a US. In inhibition of delay, a US is presented at the end of a lengthy
CS. With many conditioning trials, the animal behaves as if the early part
of the CS signals a period of no US (e.g., Rescorla, 1967a; Rosas & Alonso,
1997). A final method is one mentioned previously: backward conditioning
(see Figure 3.9D). Here, the CS occurs after the US. As described above, this
procedure sometimes establishes the CS as a signal for no US—that is, as a
conditioned inhibitor. Some theories now assume that backward condition-
ing is one of the most fundamental ways to produce inhibition.

How to detect conditioned inhibition


Knowing that you have conditioned inhibition can be a little tricky. It is
easy to recognize a conditioned excitor when you find one: The subject
responds to the CS as if it expects a US. It is more difficult to recognize a
conditioned inhibitor, though, because a signal for “no US” will not nec-
essarily cause a behavior that is different from the behavior elicited by a
stimulus that signals nothing at all. Detecting or measuring conditioned
inhibition is therefore a rather indirect business.
There are two methods used to measure inhibition (Rescorla, 1969b;
Williams, Overmier, & LoLordo, 1992). Both capitalize on inhibition and
excitation being viewed as opposites.

Figure 3.12  Hypothetical effects of presenting an excitor alone (left), the
excitor together with an inhibitor (middle), and the excitor together with
another excitor (right). The y-axis shows conditioned responding. As a rule of
thumb, CSs “summate” when they are presented together. Inhibitors subtract
from excitors, as if they have a negative value.

The first method is known as a summation
test. In it, the inhibitor is presented together with a conditioned excitor. A
true inhibitor will inhibit the response elicited by the excitor: There will be
less responding when the excitor is combined with an inhibitor than when
the excitor is presented alone (Figure 3.12). When the excitor is combined
with a neutral stimulus, there should be less of a decrease in responding.
(Some loss may occur, though, if the animal does not generalize completely
from the excitor to the novel test compound.)
Excitors and inhibitors behave as if an excitor has a positive value and
an inhibitor has a negative value. The rule of thumb is that when they are
put together, they tend to summate, so adding a negative cue (the inhibi-
tor) subtracts from the effect of the positive cue (the excitor). Consistent
with this scenario, when two excitors are put together, the response to the
compound is more than to either cue alone (e.g., Reberg, 1972; see Figure
3.12). Be careful, however. Excitors and inhibitors do not literally add to one
another. Instead, we are usually interested in knowing that some stimuli
cause more or less responding, so-called ordinal predictions. The results
shown in Figure 3.12 provide an ordering of the stimuli: Two excitors will
cause more responding than one excitor alone, and an inhibitor and an
excitor will cause less responding than the excitor alone.
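A toy illustration of the ordinal logic (the numbers are invented, and treating responding as a simple sum of signed cue values is a deliberate oversimplification) might look like this:

# Invented, signed "values" for illustration only; excitors positive, inhibitors negative.
value = {"excitor_A": 1.0, "excitor_B": 0.8, "inhibitor_X": -0.6}

def predicted_responding(cues):
    # Assume responding rises with the summed value of the cues present,
    # and cannot fall below zero.
    return max(0.0, sum(value[c] for c in cues))

alone = predicted_responding(["excitor_A"])                          # 1.0
with_inhibitor = predicted_responding(["excitor_A", "inhibitor_X"])  # 0.4
two_excitors = predicted_responding(["excitor_A", "excitor_B"])      # 1.8

# The ordinal prediction from Figure 3.12:
assert with_inhibitor < alone < two_excitors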
A second method for measuring conditioned inhibition is the retar-
dation-of-acquisition test. If a conditioned inhibitor is converted into an
excitor by pairing it with a US, responding develops to the CS very slowly;
the acquisition of responding is “retarded” (Figure 3.13) compared to when
the US is paired with a CS that is not a conditioned inhibitor. This retarda-
tion makes sense when you realize that a signal for no US is being turned
into the opposite polarity—or a signal for the US. The results of retardation
tests must be regarded with caution, though, because other things besides
inhibition can cause retarded acquisition. A CS may look like a conditioned
inhibitor in this sort of test if the animal has merely learned not to pay atten-
tion to it, a possibility that should be kept separate from inhibition. In fact,
retardation has even been observed when the CS is actually a very weak
excitor (Hall & Pearce, 1979). Thus, researchers interested in investigating
inhibition should probably use it only together with the summation test.

Figure 3.13  Hypothetical effects of pairing an inhibitor with a US. Acquisition
of conditioned responding (plotted across trials) is slower (“retarded”) when an
inhibitor is paired with a US than when a CS that has not received any previous
training is paired with a US.

Sometimes the measurement of inhibition is less tricky. In bidirectional
response systems, a response can go either above or below its baseline
level; directional changes in the baseline response may correspond to ex-
citation or inhibition. For example, resting heart rate might increase when
a fear excitor is presented but decrease when a fear inhibitor is presented.
Similarly, in autoshaping, pigeons will approach excitors and withdraw
from inhibitors (e.g., Hearst & Franklin, 1977; Kaplan & Hearst, 1982). In
these cases, it is less difficult to know an inhibitor when you have one be-
cause it evokes a change from baseline that is opposite to the one evoked
by an excitor.

Two methods that do NOT produce true inhibition


I warned you above that the notion that inhibition develops when a CS is
associated with no US was no better than a rule of thumb. This is correct,
of course; the rule is fallible. There are at least two situations in which a CS
can be presented without a US where the CS does not acquire the proper-
ties of a conditioned inhibitor.
Recall that “latent inhibition” refers to the situation in which a CS is
preexposed—without a US—on a number of trials before conditioning
begins. Preexposure to the CS makes it difficult to convert the CS into an
excitor when it is subsequently paired with the US. (Astute readers will
recognize this as a retardation-of-acquisition test.) Despite this result, and
indeed, despite the very term latent inhibition, simple preexposure does
not cause the CS to acquire true inhibition. A CS that is simply preexposed
before conditioning begins fails the summation test. That is, the CS does
not inhibit responding when it is presented together with an excitor (e.g.,
Reiss & Wagner, 1972; Rescorla, 1971). As noted above, preexposure to the
CS may cause the animal to pay less attention to the same CS later. It does
not necessarily mean that the animal learns that the CS signals “no US.”
One idea is that to acquire true inhibition, a CS must occur without a US when
that US is otherwise expected (e.g., Wagner & Rescorla, 1972). The fact that
the latent inhibition procedure does not result in true inhibition has led
some investigators to call it by a name that is more theoretically neutral:
the CS preexposure effect.
Ironically, given Pavlov’s original discoveries, another place in which
a CS is associated with no US—and does not acquire true inhibition—is
extinction. To meet today’s definition of inhibition, an extinguished CS
would need to pass both summation and retardation tests. Reberg (1972)
arranged the following summation test: Two CSs were separately paired
with a US; then one was extinguished by presenting it alone—over and
over. At the end of this phase, Reberg presented the two stimuli together in
compound. If the extinguished CS had become an inhibitor, he expected it
to subtract from responding to the other cue. Instead, it increased respond-
ing to the other cue. If anything, the extinguished CS was still an excitor
after extinction.
Other experimenters have run the retardation test. For example, rabbits
received conditioning with a CS, then extinction, and then a reconditioning
phase (Napier, MacCrae, & Kehoe, 1992). If the CS had become an inhibitor
in extinction, it should be slow to acquire responding in the reconditioning
phase. Napier and his colleagues found exactly the opposite: Recondition-
ing was very rapid. In fact, rapid reconditioning is often observed after
extinction (but see Bouton, 1986). Once again, if anything, an extinguished
CS looks like an excitor rather than an inhibitor. This fact has interesting
implications that we will discuss later (see Chapter 5). For now, however,
it seems clear that a CS does not become a conditioned inhibitor after
extinction because the CS fails both the summation and retardation tests.

Information Value in Conditioning


To the uninitiated, conditioning looks like an awfully straightforward ex-
ample of learning. When a CS and a US occur together in time, it is not
that surprising that an animal will learn to associate them. What is all
the fuss about? In many ways, the modern era of conditioning research
began in the late 1960s when several discoveries came along that shattered
some cherished intuitions about conditioning. Each discovery suggested
that conditioning does not automatically happen when a CS and a US
are paired: CS-US pairings are not enough to guarantee learning. Instead,
conditioning only occurs if the CS provides information about the upcom-
ing US. It is not possible to appreciate this idea without talking about the
experiments that encouraged it, so let us look at them.

CS-US contingencies in classical conditioning


One line of research was begun by Robert Rescorla while he was a graduate
student at the University of Pennsylvania (Rescorla, 1966, 1967b, 1968b).
Rescorla (1968) presented a series of tone CSs to his subjects. He also
presented brief shock USs. These were scheduled in different ways for
different groups of subjects.

Figure 3.14  Procedure used in Rescorla’s experiments demonstrating the
importance of CS-US contingency in conditioning. In Group 1, the probability
of the US (indicated with bullets) is greater when the CS is on than when the
CS is off. Here the contingency between CS and US is positive, and excitatory
conditioning to the CS is observed. In Group 2, the probability of the US is the
same whether the CS is on or off. Here there is no contingency between CS and
US, and that produces no conditioning even though the CS and US are paired
together many times. In Group 3, the probability of the US when the CS is on
is less than the probability of the US when the CS is off. Here the contingency
between CS and US is negative, and inhibitory conditioning of the CS is observed.

For one group (Group 1), the shock US was scheduled
to occur with a probability of .4 whenever a CS occurred; as shown in Fig-
ure 3.14, two out of every five CSs were paired with a shock. At the end
of several sessions of this training, Rescorla found that the rats were quite
afraid of the CS. When the probability of the US was .4 in the CS, the rats
had learned quite a lot.
A second group (Group 2) received the same CSs and the same USs
(within the CSs) as the first group. As illustrated in Figure 3.14, the subjects
in this group thus received the same number of pairings between CS and
US as the first group. However, the second group also received shocks
when the CS was not on. Rescorla scheduled these extra USs so that they
occurred with the same probability as those scheduled in the CS. That is,
the probability of a shock was .4 in both the presence and the absence of the
tone. The tone was then tested after several sessions of this sort of training.
Amazingly, the subjects in Group 2 acted as if they had no knowledge that
the CS and the US were associated—they showed no fear of the tone. This
result was quite impressive because the second group had had the same
number of CS-US pairings as the first group. Evidently, pairings of a CS
and a US were not sufficient to produce conditioning. In this arrangement, the
CS provided no new information about the occurrence of the US. To get
conditioning, the CS had to predict an increase in the probability of the US.

Rescorla also ran a third group (Group 3) that received the same expo-
sures to the tone CS. This group also received the shocks in the absence of
the CS that the second group had received, but it did not receive any shocks
during the CS. For this group, the onset of the CS did signal a change in
the probability of the US; in this case, it signaled a decrease in the likeli-
hood of the US. Perhaps not surprisingly, and given what you have read
in the previous section, the rats treated this CS as a safety signal—that is,
a conditioned inhibitor for shock.
For several reasons, Rescorla’s experiment initiated a profound change
in the way we conceptualize conditioning. To emphasize the point, Group
2’s lack of learning indicated that CS-US pairings were not the cause of
conditioning; to produce conditioning, the CS must actually signal an in-
crease or decrease in the probability of the US. CS-US pairings are not good
enough to produce conditioning; the CS must be informative about the US.
The second implication of Rescorla’s results is that excitation and in-
hibition can be regarded as two ends of the same continuum. An excitor
(like the tone in Group 1) is a CS that signals an increase in the probability
of a US; an inhibitor (like the one in Group 3) signals a decrease in its prob-
ability. The fundamental difference is the nature of the relationship, or
contingency, between the two events. In a positive contingency between
the CS and the US, the US is more probable when the CS is on than when
it is off. That is the condition of Group 1; excitation was learned with a
positive contingency. In a negative contingency, the US is less probable
when the CS is on than when it is off. That is the condition of Group 3; in-
hibition was learned with a negative contingency. Group 2’s treatment falls
in between. For this group, the US is equally probable when the CS is on
and when it is off. There is no contingency between the CS and the US. The
fact that nothing was learned here suggests that the lack of a contingency
describes a sort of zero point, with excitation and inhibition being created
by positive and negative contingencies on either side of it.
One warning is in order about this set of circumstances. It is tempt-
ing to conclude that Rescorla’s subjects literally learned (and understood)
the contingency between the CS and the US. Presumably, doing so would
require some fairly sophisticated mental machinery. In point of fact, learn-
ing researchers have never really supposed this conclusion (see Papini &
Bitterman, 1990). Instead, a CS-US contingency simply describes a relation-
ship between the CS and US that will allow excitation or inhibition to be
learned. Neither excitation nor inhibition requires that the subject actually
calculate a contingency or correlation coefficient in its head. We will return
to the question of how positive and negative contingencies lead to excita-
tion and inhibition in Chapter 4.
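For readers who like to see the bookkeeping, the contingencies in Rescorla's three groups can be summarized as a simple difference of probabilities. This sketch is only a description of the procedure; as the text emphasizes, it is not a claim that the animal computes anything like it.

def contingency(p_us_given_cs, p_us_given_no_cs):
    # Positive values correspond to the excitatory arrangement, negative values
    # to the inhibitory arrangement, and zero to the arrangement that produced
    # no conditioning.
    return p_us_given_cs - p_us_given_no_cs

print(contingency(0.4, 0.0))   # Group 1: +0.4 (positive contingency, excitation)
print(contingency(0.4, 0.4))   # Group 2:  0.0 (no contingency, no conditioning)
print(contingency(0.0, 0.4))   # Group 3: -0.4 (negative contingency, inhibition)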

Blocking and unblocking


Leon Kamin (1968, 1969) reported some equally interesting experiments
with rats in conditioned suppression. The design of one of his experiments
is shown in Figure 3.15A. There were three phases.

Figure 3.15  (A) Design of Kamin’s blocking experiment. Both Group 1 and
Group 2 received an equal number of pairings of the light (L) and noise (N)
compound with shock in Phase 2. For Group 1, however, the noise had
previously been associated with shock, making the light a redundant predictor
of shock. Prior conditioning with the noise “blocked” conditioning with the
light. (B) Results of the test trials. The measure is the suppression ratio; less
conditioning is indicated by a higher ratio. (After Kamin, 1969.)

Group   Phase 1         Phase 2         Test
1       16 N — Shock    8 LN — Shock    L?
2       —               8 LN — Shock    L?

In the first phase, one
of the groups received 16 trials in which a noise CS (N) was paired with a
footshock US (Shock). This was enough training to produce considerable
responding whenever the noise was presented. In the second phase, this
group received 8 further trials in which a light (L) was added to the noise to
make a compound CS (LN). A second group received the same 8 pairings of
the light and noise compound and the shock US, but the noise had not been
conditioned before. In a third phase, Kamin simply tested conditioning to
the light by presenting it alone (L). Both groups had received 8 pairings of
the light with the shock. If pairings of light and shock were all that were
required to induce conditioning, the groups should not have differed in
their conditioning to the light.
What Kamin found was very different and very interesting. The results
are shown in Figure 3.15B. The group that had received the light combined
with the noise showed good conditioning. But the group that had the light
combined with a noise that had previously been conditioned showed no
evidence of learning at all with the light. Prior conditioning with the noise is
said to have “blocked” conditioning of the light. This result is called the
blocking effect.
Blocking is important because it again suggests that conditioning does
not simply happen because a CS and a US are paired. Kamin suggested
that the blocking group did not need to learn much about the light because
the noise already predicted the shock. In effect, the light was redundant to
the noise in predicting shock. The experiment thus suggests that learning
only occurs when the CS provides new information about the US. When it
does not predict anything new, relatively little learning occurs.

Figure 3.16  (A) Design of Kamin’s unblocking experiment. For Groups 1 and
3, the light was redundant to the noise in predicting shock, and the noise
blocked conditioning to the light. For Group 2, however, when the light was
added to the noise, it signaled an increase in the intensity of the shock, which
allowed conditioning of the light, or “unblocking.” (B) Results of the test trials
with the light. Less conditioning is indicated by a higher suppression ratio.
(After Kamin, 1969.)

Group   Phase 1        Phase 2         Test
1       N — Shock      LN — Shock      L?
2       N — Shock      LN — SHOCK!!    L?
3       N — SHOCK!!    LN — SHOCK!!    L?

Kamin reported a second result that further supported this interpreta-
tion. If learning occurs when a CS predicts something new, perhaps the
rat will learn about the light if the light is made to predict something new
during the second phase. The design of this next experiment is shown in
Figure 3.16. As before, one group received N-shock training followed by
LN-shock. When the light was tested alone, blocking was again observed.
A second group received the same N-shock conditioning in the first phase,
but in the second phase—when the light was added to the noise—the com-
pound was paired with a stronger shock (SHOCK!!). Here, the light did
predict something new, and as predicted, the rats learned about it fine. A
final group received the same LN compound trials with the larger shock,
but previously the noise had been associated alone with the same larger
shock. In this case, the light did not predict anything new, and once again
blocking was observed. The results of this experiment and the previous
one suggest an interesting new idea about conditioning. As in Rescorla’s
contingency experiment, pairings of a CS and a US (light and shock) were
not sufficient to produce conditioning. Instead, learning occurred in Ka-
min’s experiments only if the CS predicted something new.
My students at the University of Vermont are familiar with another ex-
ample of the blocking effect once it is pointed out. Because the university is
about 30 minutes south of the Canadian border, nearly all the students have
visited Canada at least once. They may also have handled a little Canadian
paper money while there. The basic monetary unit in Canada is the dollar
(which is worth a little less than the U.S. dollar at this writing), but each
of the different Canadian bills ($5, $10, $20) is printed with both a number
and a color that correspond to the dollar amount. The interesting thing is
that few of my American students can remember the color of, say, the Ca-
nadian $10 bill. One possible reason is that we have had a lot of training
in the United States in which the number printed on the bill is associated
with its value (what the bill can buy). In Canada, the number and color
are both printed on the bill, but the color is redundant to the number. The
cues are analogous to the light and noise in Kamin’s blocking experiment
in that prior learning with numbers blocks learning about colors.

Overshadowing
The blocking effect suggests that CSs that are presented together in a com-
pound tend to compete with one another for conditioning. In blocking,
the blocked CS loses the competition because the other CS is more infor-
mative. The idea that compounded cues compete is also suggested by a
phenomenon known as overshadowing (Pavlov, 1927). In this case, two
CSs are simply presented in compound and paired with the US together.
Even though there is no previous conditioning phase, if one CS is stron-
ger or more salient than the other, it will prevent good conditioning of
the other CS (e.g., Mackintosh, 1976; Prados, 2011). The stronger stimulus
is said to overshadow learning about the weaker stimulus. In principle,
overshadowing can be so complete that there is no apparent learning to
the overshadowed CS. One explanation is that the more salient CS acquires
conditioning so rapidly (as discussed earlier) that it quickly comes to block
conditioning of the slower, less salient CS (e.g., Rescorla & Wagner, 1972).
Overshadowing is a little like blocking, but it differs from Kamin’s blocking
experiment in that it occurs in one conditioning phase.
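As a forward look at the kind of explanation cited just above (Rescorla & Wagner, 1972) and developed in Chapter 4, here is a bare-bones error-correction sketch in Python. The salience and learning-rate numbers are arbitrary, and this is my own toy demonstration rather than anyone's published simulation; it simply shows how a single competitive rule can produce both blocking and overshadowing.

def train(trials, salience, beta=0.3, V=None):
    # Simple error-correction learning: every CS present on a trial changes its
    # associative strength by salience * beta * (lambda - total strength of all
    # CSs present), where lambda is 1.0 on reinforced trials and 0.0 otherwise.
    V = dict(V or {})
    for cues, reinforced in trials:
        total = sum(V.get(c, 0.0) for c in cues)
        error = (1.0 if reinforced else 0.0) - total
        for c in cues:
            V[c] = V.get(c, 0.0) + salience[c] * beta * error
    return V

salience = {"noise": 0.5, "light": 0.5, "tone": 0.5, "weak": 0.1}

# Blocking: pretrain the noise alone, then reinforce the light + noise compound.
pretrained = train([({"noise"}, True)] * 16, salience)
blocked = train([({"light", "noise"}, True)] * 8, salience, V=pretrained)
control = train([({"light", "noise"}, True)] * 8, salience)
print(blocked["light"], control["light"])   # the light gains far less after pretraining

# Overshadowing: a salient tone and a weak cue are reinforced together from the start.
over = train([({"tone", "weak"}, True)] * 20, salience)
print(over["tone"], over["weak"])           # the salient tone dominates the weak cue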
It is good to know that a salient CS can overshadow a less salient one,
especially if we ever want to prevent conditioning from happening. This
could be the case, for example, if we want to prevent chemotherapy patients
from learning taste aversions to the foods that they eat at home. One way to
reduce such aversion conditioning is to give the patient a salient new flavor
(e.g., a novel candy) just before chemotherapy is administered. Conditioning
of the salient novel flavor can overshadow the conditioning of aversions to
foods eaten in recent meals (Andresen, Birch, & Johnson, 1990; Broberg &
Bernstein, 1987). In a sense, the novel flavor becomes a “scapegoat” that takes
the blame for the sickness experience (Broberg & Bernstein, 1987).

Relative validity in conditioning


Other early evidence for the importance of competition and information
value was provided by Allan Wagner, who was running a number of ex-
periments in the 1960s that were similar to Kamin’s blocking experiment.
In one type of experiment (Wagner et al., 1968), a blocking-like effect was
shown under especially interesting conditions. The design of the experiment
is shown in Figure 3.17A.

Figure 3.17  (A) Design of the “relative validity” experiment. Group Correlated
and Group Uncorrelated had the same number of X-US pairings (for both
groups, X was paired with the US half the time X was presented). For Group
Correlated, however, there was a better predictor of US (or no US) also present
on each trial (stimuli A and B, respectively). In contrast, for Group Uncorrelated,
stimulus X was no worse than A or B at predicting US (or no US); all CSs were
paired with the US half the time they were presented. When X was later tested
alone (B), it was much better conditioned in Group Uncorrelated. (After Wagner
et al., 1968.)

Group Correlated    Group Uncorrelated
AX — US             AX — US
AX — US             AX — No US
BX — No US          BX — US
BX — No US          BX — No US

There were two groups. For both groups, a CS
(called X) occurred on every trial; half the time it was presented together
with a stimulus (called A), and half the time it was presented together with
a different stimulus (called B). For one of the groups (Group Correlated), AX
was always paired with a US, and BX always occurred without a US. For
the other group (Group Uncorrelated), AX and BX were each paired with
the US only half the time. The question is, how much would the different
groups learn about X? For both groups, X was paired with the US half the
time (which is ordinarily enough to produce plenty of conditioning). For
both groups, X was also presented with A or B half the time. The difference
is that, for Group Correlated, stimulus A was a perfect predictor of the US
and stimulus B was a perfect predictor of no US. Stimulus X was not as useful
as A or B at predicting the outcomes of the trials. For Group Uncorrelated,
though, the situation was different. Here, stimuli A and B were both imper-
fect predictors of the US; like X itself, each was paired with the US half the
time. For Group Uncorrelated, X was no better or worse than stimulus A or
B at predicting the outcomes of the trials. (In fact, because X was present on
all the trials with the US, it was arguably a little better than A or B.)
The main results are shown in Figure 3.17B. There was more condi-
tioning to X in Group Uncorrelated than in Group Correlated. In Group
Correlated, there was little learning about X. (There was a lot of learning
about stimuli A and B.) Apparently, the subject learned about the best
predictors of the US and effectively ignored X. The weaker conditioning
to X in Group Correlated reflected the fact that X was less valid at predicting
the US than the cues with which it was compounded (A and B). Thus,
conditioning of X depended on its relative validity. Conditioning is a little
like a competition in which the best predictors win the prize.

Figure 3.18  In the real world, conditioning probably always occurs with
compounded CSs, and the results may be surprising. For example, two
hypothetical drug users might take a drug 50% of the time they are in a
particular room. Half of the time an odor is also present, and half of the time a
friend is present. Despite similar histories of drug use, according to the results
of the relative validity experiment (see Figure 3.17), User A will eventually
crave the drug in the presence of the odor, but not in the room alone or in the
presence of the friend. In contrast, User B will experience strong drug cravings
whenever he is in the room, or in the presence of the odor, or with the friend.
The laws of conditioning are always at work and perhaps are more subtle than
we often realize.

User A                     User B
Odor + Room — Drug         Odor + Room — Drug
Odor + Room — Drug         Odor + Room — No drug
Friend + Room — No drug    Friend + Room — Drug
Friend + Room — No drug    Friend + Room — No drug

The relative validity experiment is not a purely academic exercise. In
the world outside the laboratory, conditioning probably always involves
compounded cues. Figure 3.18 illustrates two fictional histories of drug
use for two drug users. In Chapter 2, I presented evidence suggesting that
conditioning may be involved in drug dependence. Let us assume that drug
craving is elicited by cues that are associated with the drug. If we want to
understand what conditioning research really has to say about drug depen-
dence, we must acknowledge that natural conditioning probably involves the
conditioning of compounded cues and that the amount of conditioning ac-
quired by any one cue will depend on how well it competes with other cues.
In the arrangement shown in Figure 3.18, a room is paired with a drug
on 50% of the trials for both drug users. Therefore, both drug users have
the same number of room-drug pairings. For the first drug user (User A),
however, an odor is always present on trials when the drug is taken, and
a friend is always present on trials when the drug is not taken. You may
notice that User A’s drug history is exactly like Group Correlated’s treat-
ment in the Wagner et al. (1968) experiment. Based on that experiment, we
would expect that the odor would acquire the most conditioning and would
block conditioning of the room. For the second user (User B), the situation
is like Group Uncorrelated. For this person, all the cues—room, odor, and
friend—are equally correlated with the drug; they are each paired with the
drug 50% of the time. In this case, the three stimuli will each acquire some

conditioning. Drug cravings will be controlled by completely different cues


for the two users, even though they have had very similar experiences
with the drug and the conditioned stimuli. This experiment has not actu-
ally been run with drug USs, but in principle, you may begin to see what
experiments on information value in conditioning may actually say about
real-world experience. Users receiving the same number of exposures
to the drug, the odor, the friend, and the room may differ substantially in
which cues control drug dependence.
It is not a bad exercise to construct this kind of a scenario for any of the ef-
fects described in this chapter. Conditioning research has implications for real
life, but it is more subtle, and more interesting, than many people think it is.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources and online quizzes.

Summary
1. Pavlov’s basic conditioning experiment provides a method for studying
how organisms associate events that occur together in time. Today, it is
generally believed that subjects learn to associate the CS and the US.
2. Second-order conditioning, sensory preconditioning, and generalization
each provide ways in which stimuli that have never been directly associ-
ated with a US can elicit a conditioned response.
3. Most modern research on classical conditioning uses one of several
basic methods. These include eyeblink conditioning in rabbits, fear con-
ditioning in rats, autoshaping in pigeons, appetitive conditioning in rats,
and taste aversion learning in rats. The conditioning that is observed in
each of these systems has some unique characteristics, but it is interest-
ing that the same laws of learning still generally apply.
4. The success of conditioning depends on several factors. The timing of
the CS and US is important; for best conditioning, the CS should pre-
cede the US and should occur relatively close to it in time. Also, condi-
tioning is more successful if conditioning trials are spaced apart rather
than massed together. Conditioning is also best when the CS and the
US are both novel and relatively intense or salient.
5. Researchers need to distinguish between responding to a CS that
results from true conditioning and responding that results from pseu-
doconditioning or sensitization. Control groups that receive equivalent
exposure to the CS and US are usually used for this purpose.
6. Conditioned excitors are CSs that predict a US; conditioned inhibitors
are CSs that predict a decrease in the probability of a US. Inhibitors
have effects on behavior that generally oppose the effects of excitors.
To detect inhibition, it is often necessary to run summation and retarda-
tion-of-acquisition tests.

7. Conditioned inhibition results from several different procedures, includ-


ing differential inhibition, conditioned inhibition, explicit unpairing
(or negative correlation), and inhibition of delay. Latent inhibition and
extinction do not produce CSs that satisfy the modern definition of
inhibition; for example, they fail summation tests.
8. Conditioning is not an automatic result of pairing a CS and a US.
Research on CS-US contingencies, blocking, overshadowing, and
relative validity indicates that CS-US pairings are not sufficient to
cause learning. Instead, the CS must provide non-redundant informa-
tion about the occurrence of the US for learning to occur. This idea
has stimulated some important advances in our understanding of how
conditioning works, some of which will be discussed in Chapter 4.

Discussion Questions
1. Why is it important to have terms like CS, US, CR, and UR to describe
the critical events in conditioning experiments? What are the typical CSs
and USs used in each of the conditioning preparations described in this
chapter?
2. See if you can come up with real-world examples of the following
phenomena:
(a) Second-order conditioning
(b) Sensory preconditioning
(c) Latent inhibition (the CS preexposure effect)
(d) Conditioned inhibition
(e) Blocking
(f) Relative validity
3. Illustrate the various “things that affect the strength of conditioning” by
considering how they might influence the conditioning of fear, appetite
for food, or drug craving in humans.
4. How exactly do the various results described in the “Information value
in conditioning” section of this chapter support the idea that pairing a
CS and a US is not sufficient to produce conditioning?

Key Terms
autoshaping  87
backward conditioning  92
bidirectional response system  100
blocking  104
compound conditional stimulus  98
conditional response (CR)  80
conditional stimulus (CS)  81
conditioned emotional response (CER)  86
conditioned food-cup entry  89
conditioned inhibition  98
conditioned suppression  86
conditioning preparations  84
contingency  103
CS preexposure effect  101
delay conditioning  91
differential inhibition  98
discriminative
excitation  97
excitor  97
explicitly unpaired  98
generalization  81
generalize  81
goal tracking  89
inhibition  97
inhibition of delay  98
inhibitor  97
intertrial interval  92
latent inhibition  93
magazine approach  89
massed trials  93
negative contingency  103
negative correlation  98
ordinal prediction  99
overshadowing  106
positive contingency  103
pseudoconditioning  95
relative validity  108
retardation-of-acquisition test  99
S-R learning  81
S-S learning  81
second-order (or higher-order) conditioning  83
sensitization  95
sensory preconditioning  84
simultaneous conditioning  92
spaced trials  93
stimulus substitution  81
summation test  99
suppression ratio  87
trace conditioning  92
unconditional response (UR)  80
unconditional stimulus (US)  80
US preexposure effect  94
Chapter Outline
The Rescorla-Wagner Model  114
  Blocking and unblocking  117
  Extinction and inhibition  119
  Other new predictions  122
  CS-US contingencies  125
  What does it all mean?  127
Some Problems with the Rescorla-Wagner Model  128
  The extinction of inhibition  128
  Latent inhibition  128
  Another look at blocking  129
The Role of Attention in Conditioning  130
  The Mackintosh model  130
  The Pearce-Hall model  132
  A combined approach  134
  What does it all mean?  135
Short-Term Memory and Learning  136
  Priming of the US  138
  Priming of the CS  138
  Habituation  141
  What does it all mean?  142
Nodes, Connections, and Conditioning  143
  Wagner's "SOP" model  144
  Sensory versus emotional US nodes  148
  Elemental versus configural CS nodes  150
  What does it all mean?  153
Summary  154
Discussion Questions  156
Key Terms  157
Chapter 4

Theories of Conditioning

The preceding chapters were full of so many facts that by now it must be getting difficult to keep them straight.
Here is a possible solution: What you need is a theory.
Theories are useful in organizing and integrating facts.
For example, in Chapter 3, we considered the effects
on conditioning of how the CS and US are presented
in time (see Figure 3.8). You could memorize the sepa-
rate facts: that delay conditioning is better than simul-
taneous conditioning, that simultaneous conditioning
is better than backward conditioning, and so on. Or
you could organize them with a summary: Learning is
best when the CS can signal the US. The summary is a
theoretical idea. It is relatively easy to remember, and
from the summary, you can reassemble the facts.
This chapter focuses on theories of condition-
ing—some much more systematic ideas about how
associative learning, as it is studied in conditioning,
actually works. The development of these theories
is sometimes thought to be one of the most impor-
tant achievements in learning research in the last sev-
eral decades. The theories will probably look a little
woolly at first, but they are not as confusing as you
might think, and they are even a little fun to play with.
They are worth the trouble because they have many
implications for understanding human and animal be-
havior, and they are practical because they may help
simplify and organize the facts for you. If the only
thing you remember when you walk away from this
book is a good theory, you will be able to reconstruct
a reasonable approximation of many of the facts.

Theories do more than just simplify and organize; they are usually cre-
ated to explain things. We need a tool that will help explain how animals
and humans behave as if they have learned to associate CSs and USs or
responses and reinforcers. We need to know how contingencies and in-
formation value get translated into knowledge and behavior. We need to
know why novelty of the US or CS is so important in allowing learning to
occur. By the end of this chapter, we will be able to explain and integrate
these disparate facts.
Theories are also important because they stimulate research, which is
a harder point to appreciate. If we want to know whether an explanation
is right or wrong, we need to run experiments to test it. A theory is only
testable if its predictions can be proved wrong—in other words, it must
be “falsifiable.” Good theories are always constructed so that they can be
falsified; if they cannot be proved false, we usually are not interested in
them. Good theories therefore lead to research. Right or wrong, they can
increase our knowledge.
This chapter begins with a theory that time has judged outstanding on
all these criteria: It simplified the facts, it explained things in a clear and
unambiguous way, and it was extremely testable. The theory has stimulated
much research, and it is safe to say that we would not know as much about
conditioning as we do today if the theory had not been formulated. It is a
theory originally published in two important papers by Robert Rescorla
and Allan Wagner (Rescorla & Wagner, 1972; Wagner & Rescorla, 1972).
In these papers, they tried to explain some of the exciting conditioning re-
sults that had been emerging at the time, such as the effects of information
value in learning. You will remember from Chapter 3 that these findings
challenged most views of learning, which tended to see conditioning as
quite boring and passive. Although findings like blocking and contingency
learning suggested that information value was important in conditioning,
the meaning of the term information value was fuzzy and vague. One goal
of the Rescorla-Wagner model was to pin the meaning down. The model
began by providing a very concrete account of simple conditioning.

The Rescorla-Wagner Model


The Rescorla-Wagner model is all about surprise. It assumes that learning
occurs on a conditioning trial only if the US is surprising. This idea was
first suggested by Kamin’s blocking and unblocking effects (1968, 1969)
in that learning to a light did not occur if the US was already signaled by
another CS. The signal made the US predicted and not surprising. In real-
ity, there is an element of surprise whenever learning does occur. Think
about the very first conditioning trial when a CS is first paired with a US.
Because nothing signals the US, it is surprising, and we get some learning.
As conditioning trials proceed, however, the CS will come to predict the
US, and the US will become less and less surprising. At some point during
Figure 4.1  The growth of associative strength (V) to a CS as a function of CS-US pairings. The curve approaches an asymptote set by λ.

training, the CS will predict the US perfectly, and at this point, no further
learning will occur. An upper limit to learning is reached when the US is
no longer surprising.
These ideas are illustrated in Figure 4.1, which shows the growth of
“associative strength,” the strength of the CS’s (hypothetical) association
with the US, over trials. Notice that with each trial there is an increase or
jump in associative strength. On early conditioning trials, the jumps are
large—that is, each trial causes a relatively large increase in associative
strength—but the jumps decrease in size as learning progresses until the
learning curve approaches its upper limit, or “asymptote.” Rescorla and
Wagner suggested that the size of each jump depends on how surprising
the US is on the corresponding trial. On early trials, the CS does not yet
predict the US. The US is therefore surprising, and we get a big jump. On
later trials, though, when the CS has come to predict the US, the US is not
surprising, and we get no further jumps. Surprise decreases as learning
approaches its limit. Once the CS predicts the US, the US is not surprising,
and no further learning occurs.
Rescorla and Wagner gave “associative strength” a shorter name: V,
for predictive value. They suggested that V increases on each trial until
the CS predicts the US perfectly, at which point V reaches an upper limit
of conditioning that the US will allow. The asymptote—the upper limit
of the curve—is called λ (lambda). The asymptote is determined by the
magnitude of the US. On any given trial, the change in associative strength
can be determined by a very simple equation

ΔV = αβ(λ – V)

The symbol Δ (delta) means change; α and β are fractions (they have values
between 0 and 1) that relate to the salience of the CS and US, respectively,
which we will discuss shortly. The key is the quantity in parentheses, λ –
V. This quantity describes the surprisingness of the US. Here λ is the US
term; it stands for the US. V is the learning term; it describes how well the
CS is associated with, and thus predicts, the US. The difference between
the two terms corresponds to how much bigger the US is than what the
CS predicts, and is thus a surprise. Another way to think of surprise is to
note that if the US is surprising, the CS does not predict it very well, and
there is an error in what the CS predicts. The difference between the US and
what the CS predicts (λ – V) describes the size of that error, and is therefore
sometimes called prediction error. During conditioning, as the value of V
gets bigger over trials, the CS becomes a better and better predictor of the
US, and the prediction error or difference (λ – V) gets smaller and smaller
until no further changes in associative strength occur. Conditioning thus
works, in a sense, to correct the prediction error.
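To make the arithmetic concrete, here is a minimal sketch of the single-CS learning curve in Python (the function name, the αβ value of 0.2, and the trial count are illustrative choices, not values from the text):

```python
# Minimal sketch of single-CS acquisition under the Rescorla-Wagner rule:
# delta-V = alpha * beta * (lambda - V) on every CS-US pairing.

def rw_acquisition(alpha=0.4, beta=0.5, lam=1.0, n_trials=20):
    """Return V after each CS-US pairing (alpha * beta = 0.2, as in the text's examples)."""
    v = 0.0                               # a novel CS starts with no associative strength
    history = []
    for _ in range(n_trials):
        v += alpha * beta * (lam - v)     # the jump shrinks as lambda - V (the surprise) shrinks
        history.append(v)
    return history

for trial, v in enumerate(rw_acquisition(), start=1):
    print(f"trial {trial:2d}: V = {v:.3f}")   # rapid early gains, then a plateau near lambda
```

Because the prediction error λ – V shrinks as V grows, each successive increment is smaller, which is why the printed values trace the negatively accelerated curve of Figure 4.1.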
Figure 4.2A illustrates how the picture changes if we use USs of differ-
ent magnitudes. The model assumes that larger USs mean larger λs. The
bigger the λ, the higher the asymptote that learning reaches. This assump-
tion is consistent with the effects of US magnitude described in Chapter
3: The bigger the better. Figure 4.2B illustrates what happens as we look
at CSs with different saliences. Salience affects α, a fraction with a value

Figure 4.2  (A) The effect of US magnitude (λ) on learning. The bigger the US, the higher the asymptote. (B) The effect of CS salience (α) on learning. The more salient (i.e., intense) the CS, the faster the learning approaches the asymptote, which is still determined by US magnitude (λ).

between 0 and 1. On each trial, the quantity multiplies the surprise fac-
tor in the equation; therefore, the bigger the α, the bigger the size of each
jump. Notice that α affects how quickly the learning curve approaches its
maximum, but it does not affect the maximum itself. That is always set by
λ, the magnitude of the US.
Thus far, the model is merely a description of the learning curve (see
also Bush & Mosteller, 1955). But Rescorla and Wagner added a simple but
important twist that had far-reaching implications. They proposed that the
degree to which the US is predicted on a trial depends not on any single
CS, but on all the CSs that are present on that trial. Thus, if two CSs are
presented together in a compound, they both contribute to predicting the
US. To capture this idea, Rescorla and Wagner suggested that the extent
to which the US is predicted is described by the sum of the V values of all
stimuli present on a given trial. Thus,

ΔV = αβ(λ – ΣV)

where Σ means “sum of.” The key is actually the difference between λ
and the summed value of all stimuli present on the trial. With this simple
equation, one can go a remarkably long way in describing and predicting
the results of conditioning experiments.

Blocking and unblocking


Let us first return to Kamin’s blocking effect. In the blocking experiment,
a noise and a US were first paired, and then a light-noise compound was
paired with the same US. No learning occurred to the light. The model’s
account of this effect is simple and “elegant”—a term we use to mean
“disarmingly simple and pretty.” During conditioning in the first phase,
the noise acquires a positive V value. In fact, Kamin conditioned it to about
the asymptote. Let us assume that Kamin used a US with a λ equal to 1.
At the end of the first phase, the rat has thus learned that the associative
strength of the noise (VN) is

VN = 1.0

During the second phase, conditioning trials continue, but a light is now
added to the noise. To find out what the model predicts will happen to the
light, we simply plug numbers into the equation

ΔVL = αβ(λ – ΣV)

where ΣV will equal the total associative strength on these trials, or the
values of both the noise and the light, or

ΔVL = αβ[λ – (VN + VL)]



Because the light is new to the rat, the light has no associative strength,
and its initial V value, VL, equals 0. I will also assume that αβ = 0.2. Putting
these numbers into the equation gives us

ΔVL = 0.2[1.0 – (1.0 + 0)] = 0

The model predicts no change in associative strength (ΔV) to the light.


Notice that this prediction occurs because there is no surprise or prediction
error on the compound trials: The quantity λ – ΣV equals 0. The model is
loyal to the “surprise” interpretation of blocking.
Kamin’s unblocking result was equally important. In that experiment,
Phase 1 conditioning occurred as before so that the noise was first associ-
ated with a shock. Then, on the crucial compound trials when light was
added, the intensity of the US was increased. The increased US led to
learning about the light. The model accounts for this following the same
strategy used in blocking. The key here is that the larger US in the second
phase has a bigger λ value.
As before, let us assume that the noise is first conditioned to V = 1; that
is, it is originally conditioned to the asymptote. Now, on the compound
trials, we solve for ΔVL as before:

ΔVL = αβ[λ – (VN + VL)]

But now we have a new and larger US that will support more conditioning
than the previous US. Because we used a value of λ = 1 before, we must use
a larger number this time. I will use a value of 2, but any number greater
than 1 will do:

ΔVL = 0.2[2.0 – (1.0 + 0)] = + 0.2

The model correctly predicts an increase in associative strength to the light


on the first trial when the US intensity is increased in Phase 2 of the block-
ing design.
The numbers I used in the model above are somewhat arbitrary, which
works very well as long as one sticks to certain rules. In particular, one
must be consistent in assigning λ values. The rule, again, is that larger
USs require larger λs. But note that because the numbers plugged into
the equations are arbitrary, the numbers that result must be arbitrary, too.
The model is not designed to predict the numbers of drops a dog salivates,
the amount of time a rat spends freezing, or the number of times a pigeon
pecks a key light. Instead, the model answers questions like “which group
will acquire more conditioning?” It predicts how to rank order groups. In
the illustration of blocking and unblocking, we did not arrive at a value of
“fear” (or conditioned suppression) in Kamin’s rats. What we found was
that increasing the US in Phase 2 (the unblocking procedure) will produce
more conditioning than using the same US (the blocking procedure).
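The same bookkeeping can be run as a short simulation. The sketch below (Python; αβ = 0.2 and 30 Phase 1 trials are illustrative, and the function names are my own) conditions the noise first and then runs compound trials with either the same US (blocking) or a larger US (unblocking):

```python
# Sketch of blocking vs. unblocking with the compound rule:
# delta-V = alpha*beta*(lambda - sum of V over all CSs present on the trial).

def compound_trial(v, present, alpha_beta=0.2, lam=1.0):
    """Update every CS present on one trial, in place."""
    error = lam - sum(v[cs] for cs in present)   # prediction error (surprise)
    for cs in present:
        v[cs] += alpha_beta * error

def run(phase2_lambda):
    v = {"noise": 0.0, "light": 0.0}
    for _ in range(30):                          # Phase 1: noise alone, conditioned to asymptote
        compound_trial(v, ["noise"], lam=1.0)
    for _ in range(10):                          # Phase 2: light-noise compound
        compound_trial(v, ["noise", "light"], lam=phase2_lambda)
    return {cs: round(strength, 3) for cs, strength in v.items()}

print("blocking:  ", run(phase2_lambda=1.0))    # same US: the light gains essentially nothing
print("unblocking:", run(phase2_lambda=2.0))    # larger US: the light gains real strength
```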

Extinction and inhibition


The model provides a ready account of extinction. It uses the same equa-
tion as before; the only difference is that presenting the CS with no US
is the same as running conditioning trials with a US of zero intensity. In
keeping with the US magnitude rule, we now use 0 as the value for λ.
Suppose that we have paired our noise with the US enough times that
the CS is V = 1. Now, when extinction trials begin, we introduce a new
value for λ, 0. So,

ΔVN = 0.2(0 – 1) = –0.2

Because ΔV solves to a negative number, the model predicts that V will


decrease on the extinction trial. On the next trial, we insert a reduced value
for V; ultimately, V will decrease on each trial until there is no more sur-
prise and it approaches a new asymptote of 0. This situation is shown in
Figure 4.3. The animal undergoing extinction adjusts its “expectation” so
as to accurately predict the new trial outcome. The model assumes that V
returns to a value of 0 during extinction.
The only new idea is that when a trial involves no US, we simply
use λ = 0 in the original equation. This trick has important implications
for conditioned inhibition. One of the easiest ways to train inhibition is
Pavlov’s conditioned inhibition paradigm. Here, a light might be paired
with a US, but paired with no US when it is compounded with a noise:
The noise becomes an inhibitor. The model predicts this result readily, and
believe it or not, you already have all the information you need to know!

Figure 4.3  Conditioning (left) and extinction (right) in the Rescorla-Wagner model. In extinction, the model assumes that associative strength approaches a new asymptote so that the CS now predicts no US. When there is no US, λ = 0.

Figure 4.4  The conditioning of inhibition in the Rescorla-Wagner model. At left, a light CS is associated with a US; its associative strength therefore approaches a value of λ = 1. In Phase 2, the light is compounded with a noise CS, and the compound is presented without a US (LN — No US). The value of λ is now 0; therefore, the sum of the associative strengths of L and N (ΣV) will approach 0 over trials in the second phase. Doing so requires that VL decrease a bit from 1 and that VN decrease from 0. When the associative strength of a CS goes below 0, it becomes an inhibitor.

The easiest way to understand conditioned inhibition is to break Pav-


lov’s experiment into the two phases shown in Figure 4.4. During Phase
1, assume that the light is paired with the US enough to acquire some as-
sociative strength. In fact, let’s assume that it reaches the asymptote, with
a value of λ or 1. In the second phase, we then add the noise to the light
and “pair” this compound with no US. To see the model’s prediction of
what will happen on the first compound trial, we use the usual equation
to solve for ΔVN:

ΔVN = αβ[λ – (VL + VN)]

Remember that trials with no US have a λ of 0 (a US of zero magnitude).


Furthermore, in this example, we have trained the light to have a value
of V = 1. Thus,

ΔVN = αβ[0 – (1 + 0)] = –0.2

The model predicts that the noise will decrease in strength. But because it
starts with a value of 0, it will have to decrease below zero; therefore, VN
will become negative. This is how the model defines inhibition: An inhibitor
is a CS with a negative V value.
Given this information, the picture of conditioning is now complete.
Conditioned inhibitors have negative V values, while excitors have positive
V values. The system preserves the idea that excitation and inhibition are at
opposite ends of one continuum (e.g., Rescorla, 1967b). It also allows for the
known effects of compounding excitors and inhibitors that we discussed in

Chapter 3. Recall that performance to compound stimuli is usually a func-


tion of the sum of the elements—excitors add to one another but inhibitors
subtract. By giving excitors and inhibitors positive and negative numbers,
respectively, the model preserves this idea. In general, performance to a
compound stimulus will be the summed value of the compounded CSs or,
in terms of the model, ΣV.
Figure 4.4 actually shows that the values of both the noise and the
light will change during the Phase 2 compound trials. This is because both
elements in the compound are paired with some surprise, and both may
therefore change. To solve for changes to VL, we follow the usual rules. In
fact, the equations and the numbers we enter into them are the same as
that for the noise. So just as VN decreases, so does VL. The main difference
is that whereas the decreases in VN make the neutral stimulus drop to a
value less than 0, the decreases in VL bring it down from the value of 1,
learned during the first phase.
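The two phases of Figure 4.4 are easy to check numerically. Here is a sketch in Python (illustrative parameter values; the update rule is the same compound equation used above):

```python
# Sketch of Pavlov's conditioned inhibition procedure under Rescorla-Wagner:
# Phase 1: L -> US (lambda = 1); Phase 2: LN -> no US (lambda = 0).

def trial(v, present, alpha_beta=0.2, lam=1.0):
    error = lam - sum(v[cs] for cs in present)   # surprise on this trial
    for cs in present:
        v[cs] += alpha_beta * error

v = {"L": 0.0, "N": 0.0}
for _ in range(30):
    trial(v, ["L"], lam=1.0)          # Phase 1: the light approaches lambda = 1
for _ in range(30):
    trial(v, ["L", "N"], lam=0.0)     # Phase 2: light-noise compound, no US

print({cs: round(strength, 2) for cs, strength in v.items()})
# V_L ends above 0, V_N ends below 0 (an inhibitor), and the two roughly sum to 0,
# so the compound as a whole predicts "no US" and nothing further changes.
```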
A closer look at Figure 4.4 actually reveals an interesting—and clinically
relevant—further prediction. Notice that as the compound extinction trials
continue, there is a point at which the model predicts no further changes
to VN or VL. In the long run, an infinite number of compound trials beyond
those shown above will lead to no further decrease in the V values. If we
were interested in extinguishing fear to a phobic stimulus like the light,
this fact is relevant because it tells us that certain extinction procedures
will not be effective in making the excitor truly neutral (VL never reaches
0). Basically, the rule is that if an inhibitor is presented in compound with
the excitor during extinction, the inhibitor may “protect” the excitor from
total associative loss. The prediction of protection from extinction has
been confirmed in experiments by several investigators (Lovibond, Davis,
& O’Flaherty, 2000; McConnell & Miller, 2010; Rescorla, 2003; Soltysik et
al., 1983). It is clinically relevant; clinicians may extinguish excitors in the
presence of cues that can become inhibitory during exposure therapy. If
we were to eventually test the excitor on its own—away from the influ-
ence of the inhibitor—we would observe fear performance again. This is
a possible reason for relapse.
To understand the prediction, it is helpful to remember that the Rescor-
la-Wagner equation embodies the concept of surprise. In the example we
are considering, both VL and VN decrease until their combined predictive
value—ΣV—equals zero. At this point, the inhibitory noise predicts “no
US” about as much as the excitatory light predicts the US. The two predic-
tions cancel each other out, and the compound therefore predicts nothing.
When nothing occurs at the end of the trial, there is no surprise, and no
change in associative strength occurs.
Protection from extinction has an interesting flip side. Just as presenting
an inhibitor in compound with an excitor can “protect” the excitor from
associative loss in extinction, presenting another excitor can facilitate it!
Suppose that we ran another experiment so that a light and a buzzer were
initially conditioned to have V values of 1. For one group, we then put

the light and buzzer together and present the compound without the US.
According to the equation, the loss in VL should be

ΔVL = αβ[λ – (VL + VB)]

In the next equation, I have already substituted the V values of the light
and buzzer for V. Using λ = 0 for these extinction trials, the equation
solves to

ΔVL = 0.2[0 – (1 + 1)] = –0.4

What does the resulting number mean? Essentially, it means that the de-
crease caused by the light and buzzer together will be greater than if the
light were extinguished alone. If a second group received extinction trials
with the light only (no buzzer), the result would be

ΔVL = 0.2[0 – (1)] = –0.2

The model thus predicts a greater decrease when an excitor is extinguished


in compound with another excitor. The prediction was confirmed by Wag-
ner, Saavedra, and Lehmann (see Wagner, 1971) and more recently by oth-
ers (e.g., Culver, Vervliet, & Craske, 2015; Leung, Reeks, & Westbrook, 2012;
Rescorla, 2000, 2006a). The point is this: If you want extinction trials to be
truly effective, put your CS in compound with other excitors (not inhibi-
tors) during extinction.
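The two compound-extinction predictions can be placed side by side in a short sketch (Python; the companion cue is assumed to be already fully trained, at V = –1 for the inhibitor or V = +1 for the second excitor, and all other values are illustrative):

```python
# Sketch: how a companion cue changes extinction of an excitor (the "light").
# Every trial presents the light without the US (lambda = 0).

def trial(v, present, alpha_beta=0.2, lam=0.0):
    error = lam - sum(v[cs] for cs in present)
    for cs in present:
        v[cs] += alpha_beta * error

def extinguish(companion_strength=None, n_trials=5):
    v = {"light": 1.0}                       # the light starts fully conditioned
    present = ["light"]
    if companion_strength is not None:
        v["companion"] = companion_strength  # -1 = trained inhibitor, +1 = another excitor
        present.append("companion")
    for _ in range(n_trials):
        trial(v, present)
    return round(v["light"], 3)

print("light alone:      ", extinguish())                       # partial loss after 5 trials
print("light + inhibitor:", extinguish(companion_strength=-1))  # no loss: protected from extinction
print("light + excitor:  ", extinguish(companion_strength=+1))  # larger loss on the same 5 trials
```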

Other new predictions


The fun thing about the Rescorla-Wagner model is that odd predictions
practically jump out of it once you learn to play with it. It is worth a try—
you might like it!
The top line of Figure 4.5A illustrates an experiment run by Kremer
(1978; see also Lattal & Nakajima, 1998; Rescorla, 1970, 2006b, 2007) that
would not have seemed very interesting before the Rescorla-Wagner model.
During the first phase of this experiment, a light and a noise were sepa-
rately paired with the US on many trials. This allowed the value of each
stimulus to approach the asymptote. In the second phase, Kremer put them
together in compound and continued to pair them with the US. What does
the model predict? It is actually somewhat unusual in that it predicts that
both the light and the noise will lose associative strength, even though they
continue to be paired with the US!
To understand the prediction, we merely use the same tools we have
used before. Assume that both light and noise are conditioned to the as-
ymptote in Phase 1. If we used our usual US with a value of λ = 1, then
at asymptote both VL and VN would equal 1. To predict changes that will
occur during Phase 2, we use the usual equation. For now, let’s focus on
the noise:
Figure 4.5  "Overexpectation" of the US. (A) Designs of Kremer's (1978) two experiments:

Experiment 1 — Phase 1: L — Shock, N — Shock; Phase 2: LN — Shock; Test: L?, N?
Experiment 2 — Phase 1: L — Shock, N — Shock; Phase 2: LNX — Shock; Test: X?

(B) Rescorla-Wagner predictions for what will happen when the compound LN is paired with a US after L and N have each been separately associated with the US and learning has reached the asymptote (VL = VN = λ = 1). In Phase 2, the summed strengths of L and N (ΣV) "overexpect" the US, and ΣV will therefore decrease over trials until its value is equal to λ. Doing so requires that the associative strengths of L and N decrease. (C) Rescorla-Wagner predictions for a similar experiment in which a CS that starts with a value of 0 associative strength (CS X) is added to the compound. The model predicts that X will become an inhibitor (VX will drop below 0), even though it is always paired with the US.

ΔVN = αβ[λ – (VL + VN)]

We continue to use the same US as before, with its λ value of 1.0. Given the
V values for light and noise that were established during the first phase,
the result is

ΔVN = 0.2[1.0 – (1.0 + 1.0)] = –0.2



The noise’s associative strength will therefore decrease during this phase!
In effect, the noise and light together “overpredict” the US, and the noise’s
associative strength is reduced accordingly. This surprising prediction
was confirmed by Kremer (1978) as well as by Lattal and Nakajima (1998)
and Rescorla (1970, 2006b, 2007). It is known as the overexpectation
effect.
In fact, both noise and light will decrease in strength during the second
phase; this is because both stimuli are present on these trials, and both—in
principle—are subject to the same forces. Figure 4.5B actually illustrates
the model’s prediction for both stimuli during Phase 2. Notice that the
value of neither stimulus returns to 0; do you see why? It is because the
changes will stop occurring only when the summed value of the noise and
light (ΣV) equals λ (1.0). At this point, what is predicted (ΣV) and what
actually occurs (λ) are perfectly in line with each other. There is no predic-
tion error. Once these quantities are equal, there is no more surprise, and
no further changes will occur.
Kremer (1978) actually took the model a step further. In another experi-
ment (see Figure 4.5A, second row), he ran the same Phase 1 but added a new,
almost perverse, twist when he began Phase 2: Instead of only presenting
light and noise with the US, Kremer added a third CS, X, with no previous
training. This stimulus was completely neutral at the beginning of Phase 2.
The model makes a cool prediction here: Even though X is always paired
with a US, the model predicts that it should become a conditioned inhibitor!
Arriving at the prediction is easy. As before, light and noise are both
trained to an asymptote value of λ = 1.0. The US in Phase 2 is the same,
and thus also has a λ value of 1.0. The new stimulus, X, starts with a value
of zero because it is a new stimulus with no previous history of condition-
ing. The only new trick is that ΣV now requires summing the V values of
all three stimuli: L, N, and X. To find out what happens to X, we expand
on the familiar equation:

ΔVX = αβ[λ – (VL + VN + VX)]

Substituting the numbers from above, the equation becomes:

ΔVX = 0.2[1 – (1 + 1 + 0)] = –0.2

The model predicts that the value of X will decrease (as L and N did in
Kremer’s previous experiment). But because X starts at 0, the decrease will
drive X into the realm of negative numbers; it will thus become a condi-
tioned inhibitor. Kremer (1978) confirmed this prediction in a conditioned
suppression experiment with rats.
Figure 4.5C shows the changes predicted with all three stimuli used
during Phase 2. We now see that our previous description of a conditioned
inhibitor as a stimulus that “predicts no US” is not quite correct. In this
experiment, X was always paired with a US, so how can we say that it

predicts no US? A better description may be that an inhibitor signals that


an upcoming US is not as strong as other cues on the trial predict it to be.
For many people, the symbols and numbers of the model are actually easier
to comprehend and remember than the verbal description.
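Both of Kremer's designs can be checked with the same few lines of arithmetic. A sketch in Python (illustrative values; 20 Phase 2 trials):

```python
# Sketch of the overexpectation designs in Figure 4.5 under Rescorla-Wagner.
# L and N enter Phase 2 already at asymptote (V = 1); the Phase 2 US still has lambda = 1.

def trial(v, present, alpha_beta=0.2, lam=1.0):
    error = lam - sum(v[cs] for cs in present)   # the compound "overexpects" the US, so error < 0
    for cs in present:
        v[cs] += alpha_beta * error

v_ln = {"L": 1.0, "N": 1.0}                      # Design 1: LN -> US
for _ in range(20):
    trial(v_ln, ["L", "N"])

v_lnx = {"L": 1.0, "N": 1.0, "X": 0.0}           # Design 2: LNX -> US, with X brand new
for _ in range(20):
    trial(v_lnx, ["L", "N", "X"])

print({cs: round(s, 2) for cs, s in v_ln.items()})   # L and N both lose strength; together they end near 1
print({cs: round(s, 2) for cs, s in v_lnx.items()})  # X ends below 0: an inhibitor despite US pairings
```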
Let’s pause and take stock of what we have learned thus far. In our dis-
cussion of the Rescorla-Wagner model, chaos has practically broken loose.
Kamin’s early work told us that CS-US pairings are not sufficient to cause
learning, but that turns out to be only part of the story. CS-US pairings can
actually reduce associative strength—or even cause conditioned inhibition.
And, extinction trials may not always be sufficient to cause extinction or
to produce conditioned inhibition. The model has given us a glimpse of a
brave new world. The research stimulated by the Rescorla-Wagner model
shook up our previous understanding of—and common intuitions about—
Pavlovian conditioning. In fact, the model has helped change the way we
think about this deceptively simple learning process.

CS-US contingencies
The Rescorla-Wagner model also offered a precise way to think about the
experiments suggesting a role for a CS-US contingency in conditioning
(e.g., Rescorla, 1966, 1967b, 1968b; see Chapter 3). It also clarified the con-
nection between CS-US contingency and other findings on information
value. The procedures used in these experiments are summarized again in
Figure 4.6. Remember that a negative CS-US contingency—where the US is
more probable without the CS than with it—leads to inhibitory condition-
ing of the CS, whereas a zero CS-US contingency leads to zero learning.
As noted by Papini and Bitterman (1990), one superficial explanation of
these findings has been embraced by some textbooks: Perhaps the animal
learns or directly appreciates the correlation between the CS and the US.
As was suggested in Chapter 3, however, there are less grandiose ways to
think about what is learned in the contingency experiment. The Rescorla-
Wagner model offers one of these explanations.
The model explains contingency effects in a remarkably straightforward
way. Rescorla and Wagner pointed out that the CS and US are not presented
in a vacuum in the contingency experiment. Instead, they always occur
together with stimuli that are always present in the background. These
background stimuli could include odors or sounds or visual aspects of
the room, apparatus, or box that the subject is in while it is exposed to the
CS and US. Such stimuli are called contextual stimuli, or context, and
we must always presume that the animal can learn about these cues in
addition to the ordinary CS. In fact, there is good evidence that animals
and humans do learn to associate these background stimuli with the CS
or US during conditioning. The key idea was that these contextual stimuli
may serve as a single, long-lasting CS that is present whenever the CS and
US are presented. When the US is presented in the absence of the CS, it is
therefore paired with the context. And, whenever the CS itself is presented,
it is always presented in compound with the context.
Figure 4.6  (A) A negative contingency between a CS and a US (the US is less probable in the presence of the CS than in its absence). (B) Zero contingency between a CS and a US (the US is equally probable in the presence and absence of the CS).

Once we recognize the presence of contextual stimuli, all sorts of things


begin to happen. First, consider the negative contingency situation as it is
sketched in Figure 4.6A. The model tells us that there are (basically) two
kinds of trials during the session: (1) trials when the CS is presented (but
those are really CS-context compound trials) and (2) trials during the in-
tervals between CSs, when the context is presented alone. In the negative
contingency case, the CS-context compound is not paired with the US, but
the context is paired with the US on its own. If we call the CS “X” and the
context “A,” we see that the negative contingency case is a new example
of Pavlov’s venerable conditioned inhibition paradigm:

A — US, AX — No US

We have already seen that the model has no trouble accounting for inhibi-
tion developing in this situation (see Figure 4.4).
The zero contingency case (Figure 4.6B) boils down to a very similar
argument. Once again, there are two kinds of trials: those in which the CS
and context occur together, and those in which the context occurs alone.
Here, however, the CS-context compound is sometimes paired with the
US. What are we to make of that? In the long run, the two trial types can
be thought of as AX-US and A-US, respectively; to simplify, the zero con-
tingency case boils down to another example of the blocking paradigm.
On trials in which the CS is paired with the US, the US is not surprising
because it is already predicted by the context.
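One way to see the argument is to treat the context as just another CS and simulate the two kinds of trials directly. The sketch below (Python) is only illustrative: the trial structure, probabilities, and αβ value are assumptions chosen to mimic the three contingency arrangements.

```python
import random

# Sketch: contingency effects as compound conditioning with the context.
# Even-numbered "trials" are CS presentations (always a CS + context compound);
# odd-numbered ones stand in for the intervals, when the context occurs alone.

def trial(v, present, us, alpha_beta=0.2):
    lam = 1.0 if us else 0.0
    error = lam - sum(v[cs] for cs in present)
    for cs in present:
        v[cs] += alpha_beta * error

def run(p_us_with_cs, p_us_without_cs, n=400, seed=0):
    rng = random.Random(seed)
    v = {"cs": 0.0, "context": 0.0}
    for i in range(n):
        if i % 2 == 0:
            trial(v, ["cs", "context"], rng.random() < p_us_with_cs)
        else:
            trial(v, ["context"], rng.random() < p_us_without_cs)
    return {name: round(s, 2) for name, s in v.items()}

print("positive contingency:", run(0.8, 0.0))  # the CS ends clearly excitatory
print("zero contingency:    ", run(0.5, 0.5))  # the context absorbs strength; the CS ends near 0
print("negative contingency:", run(0.0, 0.5))  # the CS ends below 0, i.e., inhibitory
```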
By pointing to the possible role of contextual cues, the model provides
an elegant description of contingency experiments (see Rescorla, 1972). Fur-
thermore, its account of these effects was tested and supported (e.g., Baker,
1977; Dweck & Wagner, 1970; Rescorla, 1972). The strategy was to use one
of several methods to reduce contextual conditioning that occurred during
contingency training. When that was done, the effects changed accordingly.
For example, when contextual conditioning is reduced during negative
contingency training, little inhibition is acquired by the CS (Baker, 1977).

The effects of CS-US contingency remain an active research question


to this day, however. There are other ways to view contingency effects
(e.g., Gallistel & Gibbon, 2000; Gibbon & Balsam, 1981; Miller & Schacht-
man, 1985; see Durlach, 1989, for one review). For example, comparator
theories also propose that contingency effects result from conditioning
of the context (Gibbon & Balsam, 1981; Miller & Schachtman, 1985), but
they give that conditioning a different role. Once again, the subject mainly
learns two things: that the CS and US are associated and that the context
and US are associated. (The CS and context are also associated, but that
can be ignored for now.) These CS-US and context-US associations are
then compared to determine the level of responding to the CS. If the CS’s
association with the US is weaker than the context’s association with the
US, the subject will not respond to the CS. In the zero contingency case, the
CS’s strength is the same as the context’s strength, and no responding is
observed. In the case of a negative contingency, the CS’s strength is weaker
than the context. Notice that the animal makes the crucial comparison after
all the learning has already occurred. The comparison thus determines
performance, not actual learning about the CS. One unique prediction of
this view is that changing the strength of the context’s association after CS
conditioning is finished should modify responding to the CS. If the con-
text’s strength is weakened after either type of conditioning (for example,
by extinguishing it), the subject should respond more to the CS. This type
of prediction has been confirmed by a number of experiments in Ralph
Miller’s laboratory (Kasprow, Schachtman, & Miller, 1987; Schachtman et
al., 1987; see also Matzel, Brown, & Miller, 1987; but see Dopson, Pearce,
& Haselgrove, 2009; Holland, 1999). Unfortunately, the complementary
prediction that strengthening the context should weaken responding to a
CS has not been confirmed (e.g., Miller, Hallam, & Grahame, 1990). Exten-
sions of comparator theory (Denniston, Savastano, & Miller, 2001; Stout
& Miller, 2007) have nonetheless attempted to address issues like this one
and have made a number of new predictions that have been confirmed in
Miller’s laboratory (e.g., Blaisdell et al., 1998; Friedman et al., 1998; Urcelay
& Miller, 2006; see Miller & Witnauer, 2015, for a recent review).
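As a toy illustration only (the actual comparator models are considerably more elaborate than this), the core idea can be sketched as a performance rule applied after learning is over; the simple difference rule below is my own simplification, not the published comparator equations:

```python
# Toy sketch of the comparator idea: the animal stores both a CS-US association and a
# context-US association, and responding to the CS reflects a comparison between them.

def response_to_cs(v_cs, v_context):
    """Respond only to the extent that the CS predicts the US better than the context does."""
    return max(0.0, v_cs - v_context)

v_cs, v_context = 0.5, 0.5
print(response_to_cs(v_cs, v_context))   # 0.0: like the zero-contingency case, no responding

v_context = 0.1                          # the context is extinguished after training
print(response_to_cs(v_cs, v_context))   # responding appears without any new learning about the CS
```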
Interestingly, nearly all the views of the effects of contingency have
proposed a role for conditioning of the context. In fact, beginning with
the Rescorla-Wagner model’s account of contingency effects, the role of
context in conditioning in general has become an active area of research
(e.g., Balsam & Tomie, 1985; Bouton & Nelson, 1998b). We will look at
some more of this research later in this chapter and in Chapter 5 as well.
Much of the knowledge gained in this area would have been delayed had it not been for the Rescorla-Wagner model.

What does it all mean?


The Rescorla-Wagner model provides a very successful account of compound
conditioning. Not only is it good at providing a parsimonious account of the

results of compound conditioning experiments, but by pointing to the pos-


sible general role of context, it emphasizes that conditioning probably always
involves the conditioning of compounds. In a way, that is one of the most
important messages to take home about the Rescorla-Wagner model. The
general points are that (1) conditioning always involves compound stimuli
and (2) the increment or decrement in conditioning to any individual element
depends importantly on what the other elements of the compound predict.
Models like the Rescorla-Wagner model begin to give us a sophisticated
handle on the probable complexity of learning in the real world.

Some Problems with the Rescorla-Wagner Model


Although the Rescorla-Wagner model is a powerful theory, in many ways
it is only a first step toward a complete account of conditioning. The model
has several well-known shortcomings (e.g., Miller, Barnet, & Grahame, 1995).

The extinction of inhibition


The Rescorla-Wagner model makes another prediction that runs counter
to all intuitions. It concerns how we might extinguish learned inhibition.
The model claims that if a conditioned inhibitor is merely presented in
extinction trials without a US (λ = 0), such trials should remove inhibition.
On these trials, the inhibitor—with its negative V value—should lose its
value until it returns to zero. That is,

ΔVN = 0.2[0 – (–1)] = +0.2

The inhibitor should “gain” strength or, more precisely, lose its negative
value until it reaches zero. The prediction is interesting because our intu-
itions about inhibition seem to predict the opposite. Why should presenting
a CS signaling no US, without a US, get rid of its signal value? Here our
intuitions turn out to be more accurate than the model. Repeated presenta-
tions of an inhibitor without the US do not decrease inhibition (e.g., DeVito
& Fowler, 1986; Witcher & Ayres, 1984; Zimmer-Hart & Rescorla, 1974);
related predictions also failed (e.g., Baker, 1974). This sort of result suggests
that there may be something wrong with the way the Rescorla-Wagner
model treats inhibition. Perhaps inhibition is not exactly the symmetrical
opposite of excitation, as the model implies by giving excitors and inhibi-
tors positive and negative numbers separated by the central value of zero.

Latent inhibition
In Chapter 3, we saw that if a subject is preexposed to a CS before condi-
tioning begins, conditioned responding appears relatively slowly during a
conditioning phase. The problem is that the model has no way to account
for this effect. During the preexposure trials, no US occurs and no US is ex-
pected. There is no basis for expecting a change in the value of V or, indeed,
a change in anything. Latent inhibition is considered to be an extremely

important effect, and this failure is viewed as a serious shortcoming of the


Rescorla-Wagner model.

Another look at blocking


The blocking effect is central to the Rescorla-Wagner model’s emphasis
on the surprisingness of the US. Blocking is said to occur because the US
is ineffective; it has been rendered unsurprising. But during the 1970s, a
series of experiments by the late Nicholas Mackintosh and his associates
raised some questions about this account (see Mackintosh, 1978, for a
review).
Consider the experiment sketched in Figure 4.7A (Mackintosh & Turner, 1971). Both groups received initial conditioning with a noise CS until the
asymptote was reached. Later (Phase 3), they received an LN compound
CS paired with a larger shock. Based on Kamin’s results, L should have
acquired some conditioning—it predicted something new. The difference
between the groups, however, occurred in between Phases 1 and 3. A con-
trol group (Group 1) received nothing, but the experimental group (Group
2) received several trials in which the LN compound CS was paired with
the original US. By now you know exactly what the Rescorla-Wagner model
predicted here. If N had already been conditioned to the asymptote, there
should have been no change in the value of V for either CS. The model
predicted that nothing would happen during Phase 2. In fact, the groups
should not have differed in this experiment.
What makes this experiment interesting, though, is that the groups did
differ in how much conditioning they eventually acquired to the light. As
shown in Figure 4.7B, the control group did learn about the light—pair-
ing the compound CS with a larger shock in Phase 3 allowed learning to
happen because of unblocking. But the unblocking effect was missing in
the experimental group. Evidently, the LN-shock trials had caused some-
thing to happen that interfered with the learning that was subsequently

Figure 4.7  (A) Design of the experiment by Mackintosh and Turner (1971):

Group 1 — Phase 1: N — Shock; Phase 2: ———; Phase 3: LN — SHOCK!!; Test: L?
Group 2 — Phase 1: N — Shock; Phase 2: LN — Shock; Phase 3: LN — SHOCK!!; Test: L?

(B) Results of the test trials. Exposure to LN — shock trials in Phase 2 made it more difficult to learn about the light in Phase 3. Remember that with the suppression ratio, lower scores indicate more conditioning. (B, after Mackintosh & Turner, 1971.)

possible in Phase 3. Mackintosh and Turner (1971) suggested that dur-


ing Phase 2, the animals recognized that L was a redundant predictor of
shock. Because of its redundancy, they learned to pay less attention to
it. Therefore, during Phase 3, there was no learning to the light because
the rats had already begun to ignore it. Consistent with this conclusion,
humans tested in related experiments with visual CSs learn to gaze less
at the blocked CS, as if they are learning not to pay attention to it (e.g.,
Beesley & Le Pelley, 2011).
Results like these suggest a second important thing that might occur in
the blocking experiment. The Rescorla-Wagner model merely emphasizes
that the US is ineffective because it is not surprising, but the Mackintosh
and Turner result suggests that animals may learn to ignore redundant
predictors of the US. Mackintosh recognized that this idea could provide a
completely different explanation of blocking. Blocking may have occurred
because the learning mechanism may have detected the redundancy of the
noise and then tuned it out. Instead of emphasizing the ineffectiveness of
the US (the approach taken by the Rescorla-Wagner model), it is possible
to emphasize the ineffectiveness of the CS. Let us look at how attention to
the CS could affect conditioning.

The Role of Attention in Conditioning


It seems obvious that learning only happens if we are paying proper atten-
tion. In fact, there is simply too much information going on in the environ-
ment for a subject to take notice of—and attend to—at any one time. Several
British psychologists have been interested in the fact that this sort of factor
must be important—even in simple conditioning. The basic idea is that the
amount of associative strength that is learned on any conditioning trial de-
pends on how much attention is paid to the CS. In addition, the attention
paid to a CS depends in part on how well the CS predicts its consequences.

The Mackintosh model


Mackintosh presented a model of attention and conditioning that was con-
sidered an alternative to the Rescorla-Wagner model (Mackintosh, 1975a).
In general, the amount of attention a subject will pay to a CS depends on
how well the CS predicts a US. If the CS is a good predictor of a US, the
subject will pay attention to it. But if the CS is no better at predicting the
US than other CSs that are also present on a trial, attention to it will decline.
The subject attends to stimuli in its environment if those stimuli are useful
in predicting biologically significant events.
Mackintosh put this plausible idea into the familiar vocabulary. Condi-
tioning trials were assumed to result in increases or decreases in associative
strength according to the equation

ΔV = αβ(λ – V)

But conditioning trials also cause changes in attention to the CS, which
Mackintosh linked to the term α. The effects of αs of different sizes were
illustrated in Figure 4.2. If α (attention to the CS) is high, the amount of
learning that occurs on any trial will be high. If α is low, the amount of
learning will be low. Notice also that if α is 0—that is, if there is no atten-
tion paid to the CS—the equation will solve to a value of 0, and there will
be no learning at all to the CS on that trial.
Mackintosh (1975a) argued that α increased to a given CS on a con-
ditioning trial if the CS was the best predictor of the US on that trial. In
contrast, α to a CS decreased if the CS was no better than the others at
predicting the US. (It is not necessary to consider the equations that repre-
sented these ideas.) As conditioning proceeds, the subject pays more and
more attention to the best predictor of the US and less and less attention
to the weaker predictors.
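The verbal rule can be rendered in a few lines of Python. This is only a schematic sketch of the idea (the attention step size, the way "best predictor" is scored, and the function names are my assumptions, not Mackintosh's actual equations, which the text does not reproduce):

```python
# Schematic sketch of the Mackintosh idea: attention (alpha) to a CS rises when that CS
# is the best predictor of the trial outcome and falls when it is no better than the rest.

def mackintosh_trial(v, alpha, present, lam, beta=0.5, step=0.1):
    for cs in present:
        own_error = abs(lam - v[cs])                                    # how well this CS predicts the US
        other_error = abs(lam - sum(v[c] for c in present if c != cs))  # how well the other cues do
        if own_error < other_error:
            alpha[cs] = min(1.0, alpha[cs] + step)   # best predictor: attention goes up
        else:
            alpha[cs] = max(0.0, alpha[cs] - step)   # redundant or worse: attention goes down
        v[cs] += alpha[cs] * beta * (lam - v[cs])    # learning on this trial is scaled by attention

v = {"noise": 1.0, "light": 0.0}       # the noise is already trained, as in the blocking design
alpha = {"noise": 0.5, "light": 0.5}
for _ in range(5):
    mackintosh_trial(v, alpha, ["noise", "light"], lam=1.0)
print(alpha)   # attention to the redundant light collapses; attention to the noise stays high
```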
One advantage of Mackintosh’s model was that it explained latent inhibi-
tion. When the CS is presented without a US during the preexposure phase,
the value of α to the CS was predicted to go down. The CS is no better than
background contextual cues at predicting no US. When the CS is paired with
the US in a second phase, the value of α starts quite low, making the incre-
ments in associative strength that happen on any trial quite small.
Naturally enough, the Mackintosh model also handled effects like the
Mackintosh and Turner result (1971) (see Figure 4.7). During the second
phase of that experiment, the experimental group (Group 2) received a
light-noise compound that was paired with shock. On the first trial, the
subject attended to both the CSs, and some learning to each CS did occur.
But the noise had already been established as a good predictor of the US in
the first phase. Its α value therefore remained high. On Trial 1, though, the
light was recognized as a worse predictor of the US, and because of that,
the value of α for the light CS decreased. As a result, by the time Phase 3
came around, the value of α to the light was low enough to interfere with
good learning in that phase.
The Mackintosh model’s explanation of blocking is different from that
of the Rescorla-Wagner model. According to the former, the US was per-
fectly capable of causing learning to both the light and the noise; the im-
portant thing was that after the first compound trial, the animal paid less
attention to the light. Because it takes one trial to learn which CSs are the
best predictors, the value of α can only change after the first conditioning
trial. In contrast to the Rescorla-Wagner model, the Mackintosh model
thus predicted normal learning about the noise on the first blocking trial.
Mackintosh (1975b) reported normal learning to the noise on the first trial
(consistent with his prediction). On the other hand, others have reported
fairly complete blocking on the first trial, in support of the Rescorla-Wagner
prediction (e.g., Balaz, Kasprow, & Miller, 1982).
Unfortunately, Mackintosh’s idea that attention should increase for
good predictors has not always fared very well (see Le Pelley, 2004, for one
Figure 4.8  (A) Design of experiment by Hall and Pearce (1979):

Group 1 — Phase 1: T — Shock; Phase 2: T — SHOCK!!
Group 2 — Phase 1: L — Shock; Phase 2: T — SHOCK!!

(B) Results of conditioning during Phase 2. Group 1, which had previously learned to associate the CS with a small shock, was slower to learn about it in Phase 2, even though the shock had previously been a good predictor of a US. (B, after Hall & Pearce, 1979.)

review). Figure 4.8 shows an experiment by Geoffrey Hall and John Pearce
(1979). In this experiment, one group received 66 initial conditioning trials
with a tone and a weak shock before conditioning with a stronger shock in
a second phase. A second group received conditioning with different shocks
in the two phases, but for this group, the tone CS was not used until the
second phase. We are interested in the rate of conditioning to the tone in the
second phase. According to Mackintosh, Group 1 should learn especially
quickly: During Phase 1, the tone was established as a good predictor of a
US, and the value of α upon entering Phase 2 should be very high. At the
start of Phase 2, the animals should be paying a great deal of attention to
the tone and should therefore learn about it very rapidly.
What Hall and Pearce found, however, was quite striking. Instead of
causing rapid conditioning, conditioning of the tone in Phase 1 caused the
animals to learn about it more slowly during Phase 2! The initial learning
caused negative transfer with learning in Phase 2, which is a phenom-
enon known as Hall-Pearce negative transfer. If the value of α had in fact
John Pearce changed for the CS during Phase 1, we must assume—quite contrary to
Mackintosh—that it actually decreased as the CS became a good predictor
of the US. Mackintosh had convinced everyone that attention to the CS is
important in classical conditioning, but his rules for how attention changes
as a function of conditioning did not appear to be quite right.

The Pearce-Hall model


Pearce and Hall therefore proposed their own rule for how attention
changes during the conditioning of a CS (Pearce & Hall, 1980). Their idea
contrasts sharply with that of Mackintosh, although it seems just as plau-
Geoffrey Hall sible intuitively. (This is a good argument against trusting your intuitions
in psychology.) They suggested that an animal should not waste mental
effort paying attention to a CS whose meaning is already well understood.
Instead, mental effort—attention—should be applied to those CSs whose
meanings are not yet understood. Once the subject has learned what the CS
predicts, it should pay less attention to it. Responding to the CS will tend
to occur automatically (see Schneider & Shiffrin, 1977; Bargh & Chartrand,
1999), without the animal paying much further attention.
Like Mackintosh, Pearce and Hall represented attention with the symbol
α. In their model, however, the value of α on any given trial was deter-
mined by how surprising the US was on the preceding trial. (If the US is
surprising, the CS is not well understood, which increases attention on
the next trial. The quantity λ – ΣV, prediction error, was used again to
represent the degree of surprise.) If the US was surprising, say, on Trial
22, the value of α would be high on Trial 23. In contrast, if the US was not
surprising on Trial 22, the value of α would be low on the next trial. The
value of α on any particular trial further determined how much would be
learned on that trial.
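A minimal sketch of this rule follows (the salience value, the number of trials, and the simple error-correcting learning step are illustrative assumptions; the published model has additional machinery): attention on trial n is set by how surprising the US was on trial n − 1, so attention fades when the US becomes predictable but stays high when the outcome keeps changing, anticipating the Kaye and Pearce (1984) result described below.

```python
# Sketch of the Pearce-Hall attention rule: alpha on trial n equals the
# absolute prediction error (surprise) from trial n - 1.
# Assumed (illustrative) values: S (salience) = 0.5; lam = 1.0 on US trials, 0.0 otherwise.

def pearce_hall(us_schedule, S=0.5, alpha0=1.0):
    V, alpha = 0.0, alpha0           # alpha starts high: a novel CS gets attention
    alphas = []
    for lam in us_schedule:          # lam = 1.0 if the US occurs on this trial, else 0.0
        alphas.append(alpha)
        V += S * alpha * (lam - V)   # how much is learned depends on current alpha
        alpha = abs(lam - V)         # attention for the NEXT trial = this trial's surprise
    return alphas

continuous = pearce_hall([1.0] * 20)        # US on every trial
partial    = pearce_hall([1.0, 0.0] * 10)   # US on half the trials (alternating here for simplicity)
print("alpha after 20 trials, continuous:", round(continuous[-1], 2))  # low
print("alpha after 20 trials, partial:   ", round(partial[-1], 2))     # stays higher
```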
To test these ideas, Kaye and Pearce (1984) ran some experiments in
which rats were conditioned with a small light CS mounted on the wall
several inches above the floor of the conditioning chamber (Figure 4.9A).
When the light was turned on, the rats would move near it and sometimes
touch it with their noses and paws. If the rats performed this orienting
behavior, it seemed reasonable to suppose that they were paying atten-
tion to the light. Figure 4.9B shows how this response changed during

Figure 4.9  (A) A rat orienting toward a light CS. (B) Orienting toward the light CS (percentage of trials, over pre-exposure and blocks of six trials) in an experiment in which the light was paired with a US on every trial (Continuous), never paired with the US (None), or paired with a US unpredictably 50% of the time (Partial). Consistent with the Pearce-Hall model, the rat oriented less to the CS when it always predicted the US or nothing. But it continued to orient when the outcome was uncertain from trial to trial. (B, after Kaye & Pearce, 1984.)
conditioning. The group labeled "Continuous" received a condensed milk
US each time the light was turned on. Initially, orienting to the light was
high (the rats oriented when the stimulus was novel), but it declined over
conditioning as if the animal paid less and less attention to the light. The
second group, labeled “None,” received no milk after each presentation
of the light. Here again, orienting to the CS declined; it was perfectly good
at predicting nothing. The most interesting group was the one labeled
“Partial.” For these rats, the milk US was presented on a random half of
the trials, but none was presented on the other half. Mixing up US and no
US trials meant that each outcome was always somewhat surprising—the
model predicted that the value of α would remain high. And, as Figure 4.9B
indicates, that is exactly what Kaye and Pearce found. The results of several
experiments like this one suggest that orienting to the light is a function of
how surprising the US is on preceding trials (e.g., Wilson, Boumphrey, &
Pearce, 1992). Similar results have also been reported in humans, who gaze
more at a visual cue if it is unpredictably paired with an outcome half the
time instead of being paired with the outcome every time or none of the
time (Hogarth, Dickinson, Austin, Brown, & Duka, 2008).
The Kaye and Pearce (1984) results support the model’s explanations
of Hall-Pearce negative transfer (see Figure 4.8) and latent inhibition. With
enough conditioning trials in Phase 1, the CS becomes a perfect predictor
of the US. The US becomes unsurprising, and the value of α at the begin-
ning of Phase 2 should be very low. Therefore, on the first trial of Phase
2, little learning is possible. Ultimately, the negative transfer effect comes
about because the animal has paid less and less attention to the CS as it
has come to know more and more about it. The same mechanism explains
latent inhibition. When the CS is presented over and over in the first phase,
the lack of a US on those trials is not surprising. The natural amount of
attention paid to a novel CS was predicted to decrease (Group None in
Figure 4.9B). The rule did a nice job accounting for both problems and
made some new predictions, too (e.g., Hall & Pearce, 1982; Haselgrove,
Esber, Pearce, & Jones, 2010).
The Pearce-Hall model also makes a number of other predictions. Sur-
prisingly, many of them are quite similar to predictions made by Mack-
intosh’s model. For example, blocking is supposed to work because the
value of α becomes very low after the first compound trial—the US is
not surprising on Trial 1. Once again, blocking should only occur after a
minimum of two compound trials. Unfortunately, as I already mentioned,
blocking is evident after only one trial. Neither Mackintosh nor Pearce and
Hall wanted that result to happen. Sometimes the results suggest that the
Rescorla-Wagner rule is at least partly on the right track.

A combined approach
So what are we to do? Perhaps it is best to accept the possibility that all the
models have something to contribute. For example, returning to attention,
although there is good evidence favoring the Pearce-Hall model’s account
of it, it would probably be a mistake to discard Mackintosh's idea that we
can also attend to good, rather than uncertain, predictors. When animals
and humans learn discriminations like AX — US, BX — no US, they do ap-
pear to pay more attention to the better predictors (A and B) than to the less
relevant cue (X) (e.g., Dopson, Esber, & Pearce, 2010; Le Pelley, Oakeshott,
Wills, & McLaren, 2005; see also Haselgrove et al., 2010; Mackintosh &
Little, 1969; Shepp & Eimas, 1964). The Pearce-Hall model cannot handle
this kind of result because attention should decline to all CSs as the sub-
ject comes to understand their meaning. How, then, can this kind of result
square with things like Hall-Pearce negative transfer, in which attention
declines to good predictors?
The new idea is that the overall attention to a CS might actually be a
joint product of two kinds of attention, the one proposed by Mackintosh
and the other proposed by Pearce and Hall (e.g., Le Pelley, 2004; Pearce &
Mackintosh, 2010). You can think of the two types of attention as having
different, but complementary, functions. The one emphasized by Mack-
intosh detects and tunes into predictors that no longer require mental
effort to understand. It allows us to exploit what we already know about
environmental cues to use them efficiently (Beesley, Nguyen, Pearson, &
Le Pelley, 2015). In contrast, the kind of attention emphasized by Pearce
and Hall processes stimuli that have uncertain meaning so that we can
learn more about them. It might involve a more controlled or effortful
form of processing of the CS (Pearce & Mackintosh, 2010) that functions
to explore and learn about things we do not already know (Beesley et al.,
2015). The two types of attention might actually work at the same time,
more or less as the original models proposed they do. This is the view
taken by hybrid attentional models, which combine the two attentional
processes and explain how they can work simultaneously (e.g., George
& Pearce, 2012; Le Pelley, 2004; Pearce & Mackintosh, 2010). Consistent
with a hybrid view, within a single experiment, humans tend to fix their
eyes on visual cues that are either high in predictive value or high in
the uncertainty of what they predict (Beesley et al., 2015). In the end,
the overall salience of a CS may thus depend on how high both types of
attention are on any trial.
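Hybrid models differ in their details, but the general idea can be sketched very roughly as follows (the weighted-sum combination and the numbers are illustrative assumptions, not an equation from any of the cited models):

```python
# Rough sketch of a hybrid attentional scheme: each cue carries a
# Mackintosh-like value (high for reliable predictors) and a Pearce-Hall-like
# value (high for uncertain predictors), and both contribute to its salience.

def effective_salience(alpha_mack, alpha_ph, w=0.5):
    """Overall salience as a simple weighted blend of the two kinds of attention."""
    return w * alpha_mack + (1 - w) * alpha_ph

# A well-learned, reliable predictor: high Mackintosh-type, low Pearce-Hall-type.
print(round(effective_salience(0.9, 0.1), 2))   # 0.5
# A cue whose outcome is still uncertain: high Pearce-Hall-type attention.
print(round(effective_salience(0.2, 0.9), 2))   # 0.55
# A cue that is neither predictive nor uncertain gets little processing.
print(round(effective_salience(0.1, 0.1), 2))   # 0.1
```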
Pearce and Mackintosh, who wrote a paper together to integrate their
ideas (Pearce & Mackintosh, 2010), also noted that if one further incorpo-
rates the basic Rescorla-Wagner equation describing the effectiveness of
the US, a blend of all three views can handle the results discussed so far
(including even that pesky blocking effect found on the first compound
trial). Thus, the Rescorla-Wagner, Mackintosh, and Pearce-Hall models
may each contain a kernel of truth, and the kernels can be combined (see
also Esber & Haselgrove, 2011; Le Pelley, 2004).

What does it all mean?


Some of the Rescorla-Wagner model’s shortcomings have been addressed
by noting that animals must pay attention to CSs in order to learn about
them. From a purely practical perspective, it is useful to remember that
learning depends on attention. If attention is low, pairings of a CS and a
US may not be sufficient to cause learning. The important thing to know,
however, is that attention itself depends on previous learning. Both the
Mackintosh and Pearce-Hall models brought us a step forward in under-
standing how attention operates. Attention changes according to how well
unconditioned stimuli have been predicted by conditioned stimuli on a
previous learning trial. Following Mackintosh, you automatically attend
to CSs that are good predictors of significant events in the environment.
Following Pearce-Hall, you pay attention to a cue on Trial 2 if you did not
predict the correct outcome on Trial 1. According to contemporary hybrid
models, both kinds of attention operate at the same time and can work
along with the Rescorla-Wagner learning rule.

Short-Term Memory and Learning


Attention has been an important concept in cognitive psychology for many
years. It is usually placed in the information processing model of memory
that became popular in psychology beginning in the late 1950s, which I
mentioned briefly in Chapter 1 (e.g., Atkinson & Shiffrin, 1971; Broadbent,
1958). Interestingly, we can think about conditioning quite profitably using
the same kinds of concepts and terms.
A simple version of this system is shown in Figure 4.10. According to
the system, stimuli from the environment first enter sensory memory—a
hypothetical memory store that briefly holds visual and auditory informa-
tion as literal images or echoes. If the stimuli are attended to, they enter
another hypothetical store known as short-term memory. You are no
doubt familiar with short-term memory. Casually speaking, it is the space
where thinking occurs. It represents the kind of memory involved when
you need to remember a telephone number and have no way to write it

Figure 4.10  The standard model of cognition or information processing. Stimuli (S) enter sensory memory and then short-term memory, which generates responses (R); short-term memory also exchanges information with long-term memory. An item can be primed in short-term memory through two routes: It can enter from the external world (via sensory memory) through self-generated priming, or it can be retrieved from long-term memory through retrieval-generated priming.
down. Information in short-term memory does not last very long unless
you rehearse it; the telephone number is lost if you do not keep repeating
it to yourself. Short-term memory is also said to have a limited capacity.
You can demonstrate that by shouting random numbers at a friend after
asking her to remember a phone number. Your friend cannot hold all the
numbers in memory at once, and your shouting will essentially knock the
numbers out; short-term memory is brief and fairly small.
What we have called learning usually involves storage of information
in long-term memory, which is very different from short-term memory.
For one thing, information stored in long-term memory lasts almost in-
definitely. It also has an enormous capacity. Notice this the next time you
are in a trivia contest or play the game Trivial Pursuit. Even if you do not
play with world-class players, it is impressive to see how much trivial
information people seem to retain throughout their lifetimes. In point of
fact, animals also remember conditioning for quite a while (e.g., Gleitman,
1971; Hendersen, 1985; see Chapter 5). When a CS-US association is learned,
we assume that it has been stored in something like long-term memory.
The key idea is that storage of information from the environment in
long-term memory depends on the whole preceding chain of events. Stim-
uli must enter the system, must be attended to, and must be processed
somehow in short-term memory. Theorists often emphasize the short-term
memory step. If information is not processed sufficiently there, it simply
will not get transferred to long-term memory.
In the 1970s, Allan Wagner (1976, 1978) used this framework to extend
the Rescorla-Wagner model. In doing so, he more than doubled the range of
facts that it explained. He kept the crucial concept of surprise. He suggested
that a surprising event gets extensive processing in short-term memory,
which increases its chances of being stored in long-term memory. Surprise
is easy to conceptualize in the information processing framework. Casually
speaking, an event is surprising if you are not already thinking about it.
Within the information processing system, an event is surprising only if it is
not already present in short-term memory. Learning depends on the event
being surprising, but surprise is reduced if the event is already present in
short-term memory when the event actually happens.
An event that is already present in short-term memory before it hap-
pens is said to be primed in short-term memory. Priming reduces surprise.
The labeled arrows in Figure 4.10 show that there are two ways stimuli
can enter short-term memory, which means that there are two ways that
items can be primed. In self-generated priming, a stimulus enters short-
term memory from the external world via sensory memory. If an event is
presented, it primes itself in short-term memory. In retrieval-generated
priming, the item is called up out of long-term memory through a process
known as memory retrieval. A second way an item can be primed, then,
is by a retrieval cue pulling it out of long-term memory. Remember that
we now believe that CSs retrieve memory representations of the US (see
Chapter 3). In the information processing system, presenting a CS will
pull the US out of long-term memory and prime it in short-term memory.
Surprise can be reduced in two ways: A retrieval cue can call the item up
out of long-term memory, or a recent presentation of the event can prime it.

Priming of the US
Now consider blocking, the finding that originally caused all this interest
in surprise. In Phase 1 of the blocking experiment, a noise is associated
with a US. When the noise is presented, it then retrieves the US represen-
tation from long-term memory and puts it into short-term memory. In
Phase 2, when light and noise are presented together, the noise immediately
primes the US into short-term memory. When the US happens at the end
of the trial, it is therefore not surprising. Blocking is the classic example of
retrieval-generated priming.
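One way to picture the whole scheme is as a small memory buffer: an event supports new learning only if it is not already in the buffer when it occurs, and items can get into the buffer either by being presented or by being retrieved. The sketch below is only an illustration of that logic (the buffer capacity, the association table, and the function names are assumptions made for the example, not part of Wagner's formal model):

```python
# Illustrative sketch of Wagner's priming idea using a tiny short-term memory buffer.
from collections import deque

CAPACITY = 3
stm = deque(maxlen=CAPACITY)             # items currently in short-term memory
ltm_associations = {"noise": "shock"}    # learned in Phase 1 of a blocking design

def present(stimulus):
    """Self-generated priming: the presented stimulus enters STM from the world."""
    stm.append(stimulus)
    # Retrieval-generated priming: the stimulus pulls its associate out of LTM.
    associate = ltm_associations.get(stimulus)
    if associate is not None:
        stm.append(associate)

def is_surprising(event):
    """An event supports new learning only if it is NOT already primed in STM."""
    return event not in stm

# Phase 2 compound trial of a blocking experiment:
present("noise")                 # the noise primes the shock representation via retrieval
present("light")
print(is_surprising("shock"))    # False -> little new learning about the light
```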
The new model suggests new possibilities. Most importantly, it suggests
that the surprisingness of the US should also be reduced by self-generated
priming. W. S. Terry, a graduate student working in Wagner’s laboratory,
investigated this idea (Terry, 1976). Rabbits received eyeblink condition-
ing with two CSs, A and B. Trials with A and B were intermixed. Four
seconds before each A-US trial, Terry presented the US. Trials with B were
not preceded by a US. Terry predicted that the US before the A-US pairing
would prime the US in short-term memory, making its occurrence with A
less surprising. What he found was consistent with this prediction: The
rabbits learned about A more slowly than they did about B.
In another experiment, Terry presented a distracting click and vibration
between the priming US and the A-US pairing. The click and vibration
allowed the rabbit to learn quite well about A. Terry suggested that these
stimuli entered short-term memory and knocked the US out after it had
been primed at the end of the A-US trial; the US was therefore surprising
again. The situation is like your shouting of numbers at your friend while
she was trying to remember the telephone number. Short-term memory
has a limited capacity, especially, perhaps, in rabbits. Wagner’s ideas about
short-term memory, priming, and the surprisingness of the US were nicely
confirmed by Terry’s experiments.

Priming of the CS
The Rescorla-Wagner model emphasized the surprisingness of the US. The
priming model expanded on this by giving us both retrieval-generated and
self-generated priming. But even more interesting, it claimed that surpris-
ingness of the CS was equally important. Learning depends on the joint
processing of CS and US in short-term memory. If surprisingness of the US
is important in determining processing, it is only natural to believe that
surprisingness of the CS is also important.
This idea gave Wagner a handle on one of the problems that the Re-
scorla-Wagner model could not deal with: latent inhibition. According to
the newer model, exposure to the CS before conditioning should reduce its
surprisingness; latent inhibition happens because the CS becomes less sur-
prising. Because Wagner had become quite specific about the mechanisms
of surprise, this idea led to new ideas about latent inhibition, which should
come about because of either self-generated or retrieval-generated priming.
The role of self-generated priming of the CS has been investigated in
several experiments. For example, Figure 4.11A illustrates the idea behind
an experiment I once ran with students Jay Sunsay and Lee Stetson (Sunsay,
Stetson, & Bouton, 2004). Different groups of rats received trials in which
one CS (e.g., a light) was paired with a food pellet US. On other trials,
another CS (e.g., a tone) was paired with the US. For a control group, no
events occurred near these conditioning trials (there were about 18 minutes
between each trial). But the other two groups received a priming stimulus
presentation 60 seconds before each trial. For the second group in Figure
4.11, we presented the CS that was about to be paired with the US. I hope
you see the idea—this first presentation of the CS should have primed the
CS in short-term memory, making it less surprising when it was presented
again and paired with the US. The third group was primed with the other
stimulus. This group received a prime, but it was with a stimulus that
should not reduce the surprisingness of the upcoming CS that was paired
with the US. Conditioned responding in the form of food-cup entries dur-

Figure 4.11  (A) Timeline of the experiment by Sunsay, Stetson, and Bouton (2004) testing the effects of self-generated priming of the CS. One group ("Same prime") received a priming presentation of the CS before every trial on which it was paired with the US. Another group ("Different prime") received a priming presentation of a different CS, and a control group received no prime. (The subjects received many trials like the ones that are shown.) (B) Results (food-cup entries over baseline, plotted over blocks of four trials) indicated that conditioning of the food-cup entry response was slowest in group "Same prime." (After Sunsay, Stetson, & Bouton, 2004.)
ing the CS-food pairings (averaged over tone and light) is shown in Figure
4.11B. Conditioning proceeded normally in the control group, and prim-
ing the irrelevant stimulus 60 seconds before each trial did little to change
the learning (the group labeled “Different prime”). However, the group
given the prime with the same CS (the group labeled “Same prime”) was
slower to learn to respond. This result is exactly what Wagner predicted.
Priming the CS in short-term memory before each CS-US pairing slowed
down the conditioning. Related results have been reported in taste aver-
sion learning, where presenting a distracting stimulus between the prime
and CS-US pairing reduced the effect of the prime (Best, Gemberling, &
Johnson, 1979). The results and interpretation are consistent with Terry’s
(1976) experiments on self-generated priming of the US described above.
So self-generated priming of the CS can interfere with learning, the
way that Wagner’s model says it should. But what about the topic we
were discussing before this brief digression—latent inhibition? In truth,
most experiments on latent inhibition involve a gap of 24 hours or so be-
tween preexposure to the CS and the beginning of conditioning. This gap
is presumably much longer than self-generated priming will last. Wagner’s
model actually contains a second mechanism that easily accounts for this
phenomenon: The CS may also be less surprising because of retrieval-
generated priming. Once again, we note that learning does not occur in
a vacuum; rather, it occurs in contexts. During the preexposure phase,
the CS may become associated with contextual cues. The animal may be
returned to its home cage overnight. As long as the context-CS association
is not forgotten, however, upon return to the same context the next day,
the context will retrieve the CS and prime it in short-term memory, which
would make the CS less surprising. Latent inhibition may result from the
context causing retrieval-generated priming.
A simple way to test this idea is to preexpose the CS in one context (e.g.,
conditioning box A) and then pair the CS and US in a completely different
context (e.g., conditioning box B). If that is done, box B cannot prime the
memory of the CS; the CS should therefore be conditioned at an ordinary
rate. This prediction has been widely tested and confirmed (e.g., Hall &
Channell, 1985; Lovibond, Preston, & Mackintosh, 1984; Swartzentruber &
Bouton, 1986). If CS preexposure and conditioning occur in different—rather
than the same—contexts, conditioning occurs at a more rapid rate. A change
of context between preexposure and conditioning reduces latent inhibition.
The effect of context in latent inhibition is more consistent with Wag-
ner’s model than the attention models considered in the previous section
(Mackintosh, 1975a; Pearce & Hall, 1980; Pearce & Mackintosh, 2010). Those
models do not predict that a context switch would have such an impor-
tant effect on latent inhibition. Interestingly, a context switch also reduces
the Hall-Pearce negative transfer effect (Swartzentruber & Bouton, 1986).
That effect can also result from the context becoming associated with the CS,
thereby making it less surprising. The fact that context is important in both
of these phenomena suggests that Wagner’s retrieval-generated priming
mechanism may have a role in both effects. It seems that the short-term
memory model of conditioning is promising indeed.

Habituation
In Chapter 2, we first encountered habituation, a common finding in studies of behavior and learning. Repeated exposure to a stimulus usually leads to a decline in strength of the response it originally elicited. Wagner's priming model provided a novel and highly testable explanation of even this phenomenon. If we assume that the response evoked by the stimulus results from its surprisingness, habituation may result from the decrease in surprise. With repeated exposure, it may become less and less surprising. As usual, the model indicates that surprise can be reduced by either self-generated priming or retrieval-generated priming. Wagner's priming model allowed, for possibly the first time, full integration of habituation with other aspects of animal learning.

Figure 4.12  Results of Whitlow's (1975) experiment on the effects of self-generated priming on habituation (mean amplitude of vasoconstriction across successive 5-second intervals). In the upper panel, a tone presented in the second position (S2) is preceded by either the same stimulus or a different stimulus (S1); notice that there is a bigger response to S2 when the preceding stimulus was different. In the lower panel, when a distractor stimulus (D) was presented between S1 and S2, the difference between the same and different conditions disappeared. The distractor knocked the tone presented at S1 out of short-term memory. (After Whitlow, 1975.)

J. W. Whitlow (1975), then a graduate student in Wagner's laboratory, examined the idea that habituation resulted from self-generated priming. He exposed rabbits to 1-second tones, which initially caused a vasoconstriction response in the ear. The rabbits were actually exposed to a jumbled series of high- and low-pitched tones presented 60 seconds apart. Half the time, tones were preceded by a tone of the same pitch, and the other half the time, tones were preceded by a tone of a different pitch. According to the priming model, presenting a tone would prime it in short-term memory, making it less surprising for a little while. This should have made the tone less surprising, but it should not have made a different tone less surprising. As shown in Figure 4.12, there was less vasoconstriction (surprise?) when a tone had been preceded by itself rather than by a different tone. But when a distracting stimulus was presented in between, there was less difference between the same and different conditions. These re-
sults are consistent with the idea that habituation here was caused, in part,
by self-generated priming in short-term memory.
An even more provocative idea is that habituation could be due to
retrieval-generated priming. Something beyond self-generated priming
is necessary because habituation can often last quite a long time between
exposures (e.g., Leaton, 1974). The idea, again, is that during exposure to
the stimulus, the animal might form an association between the stimulus
and the background context. When brought back to that context, the con-
text would retrieve the stimulus from long-term to short-term memory,
reducing its surprisingness. A clear prediction is that habituation should be
“context-specific”; that is, a change of context between repeated exposure
and a final test should make it impossible for the test context to cause prim-
ing. This prediction has now been tested in several species and with several
methods for studying habituation. Unfortunately, the prediction (and other
related ones) has not been confirmed in most experiments (e.g., Baker &
Mercier, 1982; Churchill, Remington, & Siddle, 1987; Hall & Channell, 1985;
Leaton, 1974; Marlin & Miller, 1981), although some results suggest that
habituation of some responses might be more context-specific than others
(Jordan, Strasser, & McHale, 2000; see also Honey, Good, & Manser, 1998).
For the most part, however, habituation in one context transfers quite well
to other contexts.
The trouble confirming context-specific habituation has implications
that go beyond habituation itself. For example, the priming model argues
that latent inhibition happens because preexposure to the CS habituates the
surprisingness of the CS. Hall and Channell (1985) ran a latent inhibition ex-
periment using a small light CS mounted high on the wall, much like Kaye
and Pearce (1984) did in the experiment described in Figure 4.9. During
preexposure, orienting to the CS habituated as expected. When the context
was changed, habituation remained. But when the CS was then paired
with the US, the latent inhibition effect was lost. This outcome suggests
that habituation and latent inhibition may result from separable processes;
one (latent inhibition) is context-specific, and the other (habituation) is not.
Although Wagner’s model was correct in predicting that latent inhibition
would be lost with a change of context, it might not have pegged the cor-
rect reason. We will return to this issue in Chapter 5.

What does it all mean?


The short-term memory model expanded on the surprise idea first built
into the Rescorla-Wagner model. Surprise became more than a mathemati-
cal quantity: It depended on whether an event is already represented in
short-term memory, and it could be reduced by either of two priming
mechanisms. The theory pointed for the first time to the surprisingness of
the CS as well as the US, and it integrated habituation with other aspects
of learning. In fact, because processing of the CS in short-term memory
is analogous to paying attention to the CS, the priming model was able
to handle many of the attentional effects proposed by Mackintosh and by
Pearce and Hall. It integrated a very large amount of information. The
model was not perfect, but it certainly was impressive.
By the late 1970s, classical conditioning was being interpreted from an
explicitly “cognitive” framework. Due in large part to Wagner’s model, it
became conventional to speak of the role of short-term memory, retrieval,
and rehearsal when discussing conditioning. Wagner’s model showed how
this sort of account could be rigorous, and its power convinced people of
the fruitfulness of a cognitive approach. Conditioning is now thought to
depend crucially on how memories are processed in short-term memory.

Nodes, Connections, and Conditioning


Another way to think about long-term memory is to focus on the associa-
tions that are hypothetically stored there. As we saw in Chapter 3, it is
useful to assume that conditioning results in an association between rep-
resentations of the CS and US. Figure 4.13 shows an association between
a tone and a shock represented in long-term memory. Here, the tone-shock
association exists in memory alongside all sorts of other representations.
For instance, there is a representation of a light and a noise, which may or
may not also be associated with shock. In modern parlance, the memory
representations—which are activated when the stimuli are presented in the
real world—are known as nodes. The associations between nodes, which
vary in strength due to learning, are often known as connections. A whole
set of interconnected nodes is known as a network. As we saw in Chapter
1, the “neural network” or “connectionist” framework became popular in
psychology in the 1980s (e.g., McClelland & Rumelhart, 1985), and there
will be more to say about it in Chapter 6. For now, think of the connec-
tionist framework as an expansion of the way we began to think about
conditioning in Chapter 3. When a CS is presented, its node is activated,
and activation travels down existing associations to activate other nodes.

Figure 4.13  Long-term memory as a place full of memory nodes (represented as circles), here a light, tone, noise, shock, and context, that may be associated with one another. Associations (or connections) are represented as arrows.
Activation of a node is only temporary and is thus analogous to being
in short-term memory. Notice that a node will be activated either (1) when
the item occurs in the external world or (2) when another node associated
with it has been activated. These two means of activation are the same as
putting the item in short-term memory through (1) self-generated priming
and (2) retrieval-generated priming, respectively. Accordingly, in the 1980s,
Wagner and his associates began putting the short-term memory model
into this connectionist framework (Mazur & Wagner, 1982; Wagner, 1981;
Wagner & Brandon, 1989, 2001). In important ways, the model was the
same as before, but the new version expanded the model’s scope and power
even further. It is now the single most complete account of conditioning
and associative learning that is available.

Wagner’s “SOP” model


Wagner’s 1981 version of the model is known as SOP, which stands for
standard operating procedure or sometimes opponent process. The
name does not describe the model very well—although Chapter 9 will
have more to say about “opponent processes.” But the name stuck. This
approach accepts the idea that a CS or US node becomes activated when
the stimulus is presented in the real world. But the model assumes that
activation has two levels of intensity, both of which are illustrated in Figure
4.14. When a CS or a US is presented, its node is first activated to a state
called A1. This state is a high level of activation and is analogous to an item
being in focal attention. The node stays in A1 only briefly. It soon decays
to a lower level of activation, known as A2, which is analogous to an item
being in peripheral attention. The node stays in A2 for a somewhat longer
period, but then eventually returns to the inactive state, where it remains
indefinitely until the node is activated again. It is a fixed chain of events.
Once the node is in A1, it always decays to A2, and from there it always
becomes inactive. Once the node is in A2, it must always become inactive
before it can go to A1 again; it cannot go directly from A2 to A1.
Three other ideas complete the picture of how SOP operates. First, an
association will be formed between two nodes (like a CS and a US) only
if they are both activated to a state of focal activation—A1—at the same
time. This is like having them rehearsed together in short-term memory;
as we saw before, it is what allows learning to happen on conditioning
trials. Second, once the CS and US are associated, the direct activation of
the CS node will now activate the US node—but not to its highest level of
activation. The CS activates the US node only to the level of A2. This makes
sense: A memory activated by a retrieval cue (and put in A2) is never
quite as vivid as the real thing (in A1). Third, each node is actually made
up of a large set of elements that are activated when the node is said to be
activated. It is technically the node’s elements that go to A1, decay to A2,
and then become inactive again. When I say that a node is in A1 or A2 at a
particular time, what I really mean is that a large proportion of the node’s
elements are in that state at that time (see Figure 4.14B).
Figure 4.14  Activation of a memory node in SOP theory. (A) When the stimulus is presented, the node goes into A1, decays to A2, and then becomes inactive again. (B) Activation of the node depends on the proportion of elements within the node that individually go from A1 to A2 and then become inactive. Some elements decay more quickly than others; activation of the node actually reflects the proportion of elements in A1 or A2 at any given time.

Let us stop for a moment and consider what these ideas can do. When
a CS and a US occur together on a conditioning trial, the CS and US nodes
are in A1 together for a while; the strength of the association between them
will consequently increase a little bit. The association will get stronger and
stronger with each conditioning trial. As the association becomes stronger,
the CS becomes better and better at activating elements in the US node on
each trial, but only to the level of A2. Notice what this will do. The fact
that the CS activates elements in the US node to A2 will now prevent them
from going to A1 when the US actually occurs on the trial (remember that
you cannot go directly from A2 to A1). In the long run, the US becomes
less and less effective at causing new increments in conditioning because
it is already expected and not surprising (it is in A2 and cannot go to A1).
We have seen this idea before. As a consequence, there is less and less of an
increase in associative strength on each trial, and the usual learning curve
with an asymptote is observed (see Figure 4.1).
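To see how these assumptions produce the familiar negatively accelerated learning curve, here is a deliberately oversimplified sketch (the learning rate is an illustrative assumption, and real SOP tracks distributions of elements moving among states in continuous time rather than one number per trial):

```python
# Highly simplified SOP sketch: on each trial, the CS retrieves the US node
# into A2 (in proportion to the current associative strength V), and only US
# elements that are NOT already in A2 can enter A1 when the US actually arrives.
# Learning occurs to the extent that CS and US elements are in A1 together.

LEARN_RATE = 0.4   # illustrative assumption

V = 0.0                                   # CS-US associative strength
for trial in range(1, 9):
    cs_in_A1 = 1.0                        # CS presented: its elements go to A1
    us_in_A2 = V                          # retrieval-generated priming of the US
    us_in_A1 = 1.0 - us_in_A2             # A2 elements cannot jump back to A1
    V += LEARN_RATE * cs_in_A1 * us_in_A1
    print(f"trial {trial}: V = {V:.2f}")  # a negatively accelerated learning curve
```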
The same ideas explain the priming effects that were so new and im-
portant to the short-term memory model (see pp. 137–141). Remember
that priming a stimulus in short-term memory made it less surprising and
therefore less available for learning. In SOP, priming a stimulus puts the
corresponding memory node in A2, which prevents it from going back to
A1. In self-generated priming (e.g., Best et al., 1979; Sunsay et al., 2004;
Terry, 1976), presenting a CS or a US just before a conditioning trial puts

Figure 4.15  Activity in a CS node and US node when the US occurs (A) soon after
the CS, (B) longer after the CS, and (C) before the CS. These conditions describe
delay conditioning, trace conditioning, and backward conditioning, respectively.
it briefly in A1, but it soon decays to A2. If the node is still in A2 when the
stimulus is presented on the next conditioning trial, it cannot go back to
A1, and excitatory learning cannot occur. In retrieval-generated priming,
presenting a retrieval cue puts the associated node in A2. Once again, the
retrieved stimulus cannot be put in A1 while it is in A2, and excitatory
learning cannot occur (e.g., Kamin, 1969). Distractors can knock primed
information out of short-term memory (see Figures 4.11 and 4.12) because
only a limited number of nodes in the entire memory system can be ac-
tive at any one time. The familiar ideas about memory and surprise are
all found in SOP in a new form. The model builds on the successes of the
earlier models.
The SOP model also does much more than the earlier models. One
of the most famous factors that affect the strength of conditioning is the
timing of the CS and US (see Chapter 3). Believe it or not, none of the
theories we have discussed so far can explain this fundamental effect. But
SOP does—one of the most important things to know about SOP is that it
explains the effects of time on conditioning. Figure 4.15A shows activity
in a CS node and a US node when the US occurs soon after the CS. In this
arrangement, the CS and US nodes are simultaneously in the A1 state for
a good period of time, and conditioning will therefore occur. However, if
US presentation is delayed, fewer elements of the CS would be in the A1
state, and less conditioning will occur (Figure 4.15B). Because individual
elements vary in how quickly they decay, the proportion of CS elements
that are in A1 gradually decreases as time goes by; the model therefore
predicts smooth trace interval functions like the ones shown in Figure 2.14.
Backward conditioning, in which the US is presented before the CS, il-
lustrates another important feature of SOP. Here the US is in A2 when the
CS is put into A1 (Figure 4.15C). These are the conditions that will cause
an inhibitory CS-US association to be formed. To state it more clearly, an
inhibitory association will develop if the CS node is in A1 at the same time that the
US node is in A2. Backward conditioning can therefore lead to conditioned
inhibition. Notice, though, that inhibitory conditioning will actually de-
pend on the precise timing of US and CS. If there is too long a gap between
US and CS, the US node will have decayed from A2 to the inactive state
before the CS is presented; neither inhibition nor excitation will be learned.
Also notice that if the CS follows the US very closely in time, the US may
still be in A1 when the CS is put in A1, which would lead to excitatory
backward conditioning. In SOP, timing is (almost) everything. In contrast,
the previous models had ignored the effects of time.
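Under the same oversimplified assumptions as the earlier sketch, the excitatory and inhibitory rules can be written as two opposing terms (the rate values are again illustrative):

```python
# Extending the simplified SOP sketch with the inhibitory rule: excitation grows
# when CS and US elements are in A1 together, and inhibition grows when CS
# elements are in A1 while US elements are in A2.

EXC_RATE, INH_RATE = 0.4, 0.2   # illustrative assumptions

def net_change(cs_in_A1, us_in_A1, us_in_A2):
    """Net change in CS-US associative strength at one moment."""
    return EXC_RATE * cs_in_A1 * us_in_A1 - INH_RATE * cs_in_A1 * us_in_A2

# Forward pairing: the US arrives while the CS is still largely in A1 -> excitation.
print(net_change(cs_in_A1=0.8, us_in_A1=1.0, us_in_A2=0.0))   # positive
# Backward pairing: the CS arrives after the US has decayed to A2 -> net inhibition.
print(net_change(cs_in_A1=1.0, us_in_A1=0.0, us_in_A2=0.6))   # negative
```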
SOP’s inhibition principle also explains how inhibition is learned dur-
ing compound conditioning. Remember that in the traditional “conditioned
inhibition” procedure, one CS (A) is paired with a US, and on other trials,
it is presented with another stimulus (X) without the US. In this procedure,
X becomes a conditioned inhibitor. Interestingly, this is entirely consistent
with SOP’s inhibitory learning principle. Thanks to conditioning on the
A-US trials, on the nonreinforced compound trials (AX — No US) A will
activate the US node to the A2 state, and X will be presented at the same
time that the US node is in A2. These are the conditions for inhibitory
learning: X is in A1 while the US node is in A2. SOP explains why back-
ward conditioning and the conditioned inhibition procedure both produce
inhibitory conditioning. I hope that you can see that SOP addresses a truly
remarkable range of conditioning data.

Sensory versus emotional US nodes


Eight years after SOP was first introduced, Wagner and his colleague Susan
Brandon expanded the model in an important direction. They noted that
USs often have emotional as well as sensory qualities (Wagner & Brandon,
1989; see also Konorski, 1967). For example, when a rabbit receives a shock
US near the eye, the shock arouses an affect or emotion, such as fear, but
the rabbit also perceives that the shock was delivered specifically to the
left eye. Similarly, if you are nearly hit by a car while riding your bicycle,
you may experience a strong emotional reaction, but you might also notice
the color and the make of the car that nearly hit you. An expanded ver-
sion of SOP, called AESOP, for “affective extension of SOP,” assumes that
a US presentation actually activates two US nodes: a sensory node that
corresponds to the stimulus’s specific sensory qualities and an “emotive”
node that corresponds to its affective qualities. During conditioning, the
CS becomes associated with both of these nodes in parallel (Figure 4.16),
and they each control a different response. Both associations are learned
according to the rules described above. That is, presentation of the CS and
the US cause the CS and the emotive and sensory US nodes to first go to
A1, then A2, and then to inactive. The two associations are learned when
the CS node and the corresponding US node are in the A1 state at the same
time. The only difference is that the emotive node is assumed to move be-
tween the A1, A2, and inactive states more slowly than the sensory node.
This state of affairs is consistent with your bike-riding experience: The
emotional effects of that near-accident on your bike tend to persist much
longer than the sensory experience does.
Figure 4.16  AESOP envisions parallel associations between the CS and sensory and emotive US nodes (CS → US Sensory → Eyeblink; CS → US Emotive → Fear). When these nodes are activated by the CS, they cause different conditioned responses.

Consider a rabbit in an eyeblink experiment in which the CS is paired with a US delivered near the left eye. As the association between the CS and the emotional US node strengthens during conditioning, presentation of the CS begins to activate the emotional node, which evokes fear responses. At the same time, as the association between the CS and the sensory node becomes stronger, the CS activates the sensory node, which causes the rabbit to blink its left eye when the CS is presented. (The fact that the response is so specific to the left eye—not the right one—is consistent with the idea that it is specific to a sensory aspect of the US.) Both kinds of responses happen when the CS activates the corresponding US node to A2. But because the emotive
node stays in A2 much longer than the sensory node, the fear state persists
longer than the eyeblink. The model acknowledges that CSs do not really
evoke only one response; in fact, they may elicit multiple responses in par-
allel. Emotive and sensory conditioning are not completely independent,
though. The theory also proposes that the emotive response will invigorate
the sensory response (see Chapter 5).
Now return your thoughts once more to backward conditioning. Here
the US precedes the CS. When the US is presented, both the sensory and
emotive US nodes will go immediately into A1 and then decay to A2.
However, the decay will be far quicker for the sensory node. Given this
circumstance, it should be possible to present the CS at a point in time when
the sensory node has moved to A2 while the emotive node is still in A1. Do
you see what this predicts? The sensory association that results should be
inhibitory—the CS is associated with the sensory node in A2. The emotive
association, however, should be excitatory—the CS is associated with the
emotive node in A1. Thus, the CS should inhibit the eyeblink CR at the
same time it excites fear. This rather striking prediction is consistent with
results reported by Tait and Saladin (1986) and McNish, Betts, Brandon,
and Wagner (1997).
AESOP acknowledges that USs have multiple qualities that we learn
about in a parallel fashion. This is generally consistent with what we know
about the brain, which has separate systems that process different aspects
of events in parallel. The distinction between emotive and sensory con-
ditioning is also important to keep in mind when we apply conditioning
theories to the real world. For example, one of the major symptoms of
posttraumatic stress disorder is that patients typically “reexperience” both
sensory impressions and emotional responses that are associated with a
very traumatic event in their lives (e.g., Conway, 2005; Ehlers, Hackmann,
& Michael, 2004). Sensory and emotive conditioning may play a role in this
phenomenon. For example, a friend of mine (who happens to be an expert
on autobiographical memory) once told me of an acquaintance who was
in an accident in which the train he was riding was derailed. Because the
train was powered by electric wires overhead, a bright blue flash occurred
as the train was disconnected and careened off the tracks. Although the
man survived, to this day an encounter with a bright blue flash (e.g., the
light on top of a police car) evokes in him both visual memories of images
surrounding the accident (activation of associated sensory nodes?) and an
emotional response (activation of the associated emotional node?). That
is, the CS has both sensory and emotional effects. Interestingly, according
to AESOP, because the emotional effect of the accident lasted longer than
the sensory effect, a large number of CSs that both preceded and followed
the train going off the rails might be able to evoke an emotional response.
AESOP’s distinction between emotional and sensory aspects of condition-
ing also yields other predictions that have been tested and confirmed (e.g.,
Betts, Brandon, & Wagner, 1996).
Elemental versus configural CS nodes


Wagner and Brandon went on to expand their thinking about what goes on
in the CS node when it is activated (Wagner, 2003, 2008; Wagner & Brandon,
2001). Their ideas were largely stimulated by the work of John Pearce, the
theorist who was 50% responsible for the Pearce-Hall model discussed
above (Pearce & Hall, 1980). Beginning in the late 1980s (e.g., Pearce, 1987,
1994, 2002), Pearce and his students began emphasizing certain facts about
conditioning that no theory discussed so far can handle. One of the simplest
is a phenomenon that Pavlov discovered, which he called external inhibi-
tion (Pavlov, 1927). Suppose that a subject is conditioned so that a CS (A) is
paired repeatedly with a US. That is simple enough: A will gain in associa-
tive strength until it reaches an asymptote. But now, after conditioning A,
let us present it together with a new, untrained stimulus, B. Conditioning
with A is now followed by tests of AB. Even though B has zero associative
strength, the addition of B somehow reduces the response to A (e.g., Pavlov,
1927). Intuitively, it occurs because the tested stimulus (CS AB) is different
from the one that was associated with the US (CS A). There is imperfect
generalization from A to AB. Adding B to A thus causes generalization
decrement. But surprisingly, the models we have considered so far do
not explain or predict the effect of adding B to A. In each of them, A has
acquired associative strength, but because B has none, it should have no
effect on responding to A.
Pearce (1987, 1994) explained this kind of result by assuming that the
organism will respond to the entire set of stimuli present on a conditioning
trial as if it has been presented as a single stimulus. In the first phase of the
external inhibition experiment, the animal learns to associate A and the US,
but when presented with the compound (AB), the animal simply responds
to this stimulus to the extent it is similar to the previously conditioned one.
Pearce suggests that the similarity between two stimuli (like A and AB) is
roughly the percentage of stimuli they share in common. Because A is 50%
of the stimuli in AB, the similarity between A and AB is 50%. To predict the
relative amount of responding one would observe when AB is presented
after conditioning of A, you simply multiply the associative strength of A by
the similarity between A and AB (50%). The theory thus explains external
inhibition. It also predicts the exact same effect if AB is first conditioned
and then A is tested alone. Here again, the organism is tested with a CS
that has 50% in common with the stimulus that was conditioned in Phase
1. Pearce has shown that a full quantitative model using this idea at its
core can explain many of the basic facts of conditioning, such as blocking,
conditioned inhibition, and so forth.
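Following the simplified description given here (Pearce's published model computes similarity and generalization more formally), the core calculation can be sketched like this:

```python
# Sketch of generalization in a configural scheme, following the simplified
# description in the text: responding to a test compound is the associative
# strength of the trained configuration scaled by their similarity, taken here
# as the proportion of stimuli the two configurations share.

def similarity(trained, test):
    trained, test = set(trained), set(test)
    return len(trained & test) / len(trained | test)

def generalized_response(v_trained, trained, test):
    return v_trained * similarity(trained, test)

# External inhibition: condition A, then test AB.
print(generalized_response(1.0, ["A"], ["A", "B"]))        # 0.5
# The same decrement the other way: condition AB, then test A alone.
print(generalized_response(1.0, ["A", "B"], ["A"]))        # 0.5
# A and ABC share a third of their stimuli.
print(generalized_response(1.0, ["A"], ["A", "B", "C"]))   # about 0.33
```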
Pearce’s approach is called a configural theory because it assumes
that we learn about the entire set or “configuration” of stimuli that are
presented on any trial. When a compound made of several CSs (like A, B,
and C) is paired with the US, the approach claims that we learn a single
association between “ABC” and the US. This idea is illustrated in Fig-
ure 4.17. In contrast, the theories we have previously considered are all
elemental theories because they assume that each of the separate elements in a compound (A, B, and C) are individually associated with the US (also illustrated in Figure 4.17). According to Pearce, this thinking is wrong in that we do not learn all those separate associations. We learn a single association between ABC and the US, and then we respond to other stimuli depending on how similar they are to ABC.

Figure 4.17  Organisms might learn about either (A) elemental CS nodes (separate A → US, B → US, and C → US associations) or (B) configural CS nodes (a single ABC → US association) when a compound CS (ABC) is associated with a US.

The Pearce model stimulated lots of experiments that confirmed many new predictions. It is especially good at explaining discriminations in which organisms learn to respond differently to CSs when they are presented alone and when they are combined. For example, in positive patterning, two CSs are paired with the US when they are presented together, but not when they are presented alone (AB+, A–, B–). The animal learns to respond to AB, but not to A or B presented separately. Conversely, in negative patterning, CSs are paired with the US when they are presented alone, but not when they are combined (A+, B+, AB–). Here, animals learn to respond to A and B presented separately, but not when they are combined in the compound. The negative
could only learn separate associations to the elements A and B;
these associations would summate when A and B are combined, forever
(and incorrectly) producing more responding to AB than to A or B! For this
reason, the negative patterning discrimination is considered a classic case
that proves the existence of configural cues. In the Pearce model, A and B
each acquire excitation, and the AB configuration acquires inhibition that
offsets the excitation that generalizes to it from A and B.
Pearce’s model also predicts other new results that other models do not
(for reviews, see Pearce, 2002; Pearce & Bouton, 2001). For example, think
about an animal learning to discriminate between stimulus A and stimuli
A and B combined (A+, AB–). It seems obvious that this discrimination
will be easier than one in which a third CS (C) is added to both types of
trials (AC+, ABC–). Intuitively, AC and ABC are more similar and should
be harder to discriminate; in the model, they share a larger percentage of
elements. Pearce and Redhead (1993) confirmed that this is the case (see
also Redhead & Pearce, 1995). Amazingly, elemental models of condition-
ing do not predict this simple finding. They seem to underestimate the
importance of the similarity between compounds that are associated with
the US and with no US.
The success of the Pearce model has tempted many researchers to aban-
don the idea that organisms associate each element separately in favor of
a configural approach. However, Wagner and Brandon went on to show
that an elemental theory like SOP can explain most of the newer findings
(Wagner, 2003, 2008; Wagner & Brandon, 2001). To put it simply, they sug-
gest that activity in the CS node fundamentally changes when the CS is
presented along with another stimulus. Consider your perception of musi-
cal notes. If you strike middle C on a piano, it sounds one way when you
hit the key alone, but another way when you strike it along with the notes
E or E-flat. (E and E-flat along with C create the sounds of a major chord
and minor chord, respectively.) The extra note seems to change the quality
of the original note. SOP now supposes that something like this happens
when CSs are presented together in compounds—especially when they
are from the same sensory modality. Remember that the model actually
assumes that each CS node is made up of a lot of little elements. Each time
CS A is presented, it activates a number of elements in the corresponding
node. Some of these elements are activated whenever A is presented. But
another set of elements are activated when A is presented with a second
stimulus, like B. These new elements replace some of the elements that are
activated when A is presented alone. In this way, when B is added to A, the
compound AB does not activate all the elements in A that were associated
with the US when A was paired with the US alone. Adding B to A thus
weakens the conditioned response. That is how external inhibition can be
explained by an elemental theory.
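The replaced-elements idea can be sketched the same way (which elements exist, how many are replaced, and the strength value are all illustrative assumptions):

```python
# Sketch of the "replaced elements" idea: some of the elements that A activates
# when presented alone are swapped out for different elements when A is
# presented together with B.

A_ALONE       = {"a1", "a2", "a3", "a4"}    # elements active when A is presented alone
A_IN_COMPOUND = {"a1", "a2", "a3", "ab1"}   # "a4" is replaced by "ab1" when B is added
B_ELEMENTS    = {"b1", "b2", "b3", "b4"}    # untrained elements activated by B

# Suppose each element of A alone acquired 0.25 units of strength when A was
# paired with the US. Testing the compound AB activates only the conditioned
# elements that survive the replacement:
strength_per_element = 0.25
surviving = A_ALONE & A_IN_COMPOUND
response_to_AB = strength_per_element * len(surviving)
print(response_to_AB)   # 0.75: adding B weakens responding (external inhibition)
```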
Some interesting research has contrasted SOP’s “replaced elements”
conception with Pearce’s configural approach. For example, Brandon,
Vogel, and Wagner (2000) studied eyeblink conditioning in two groups
of rabbits. As illustrated in Figure 4.18A, one group received condition-
ing with a CS, A, alone, while another group received conditioning with
the compound ABC. The three CSs were a light, a tone, and a vibrotac-
tile stimulus, all from different sensory modalities. Once conditioning in
both groups reached its maximum, all rabbits received tests with A, AB,
and ABC. How much would conditioning from the first phase generalize
to the new test stimuli? Notice that the group trained with A alone had
CSs added during testing (AB and ABC include added CSs), whereas the
group conditioned with ABC had CSs removed from the compound dur-
ing testing (AB and A dropped C and B). According to Pearce (1987, 1994),
either adding or removing CSs should have the same effect: No matter
how you slice it, A and ABC share 33% of their elements, so the same drop
in responding should occur when rabbits conditioned with A are tested
with ABC and when rabbits conditioned with ABC are tested with A. The
“replaced elements” conception suggests a possible difference, however:
After conditioning with A, adding new elements should cause a drop in
responding because the new stimuli replace some of the elements activated
by A. Subtracting CSs, though, should be an even bigger deal because SOP
(like all elemental models since Rescorla-Wagner) assumes that when the
compound ABC is conditioned, A, B, and C will each compete with one
another for association with the US. Each would acquire one-third of the
available associative strength. (For example, in the Rescorla-Wagner model,
all three CSs must share λ.) Therefore, dropping C from ABC should re-
move about 33% of conditioned responding, and dropping BC from ABC
should remove a whopping 66%. There should be a huge effect of remov-

Figure 4.18  (A) Design and (B) results of the experiment by Brandon, Vogel, and Wagner (2000). One group had stimulus A paired with the US; the other group had the compound ABC paired with the US. Both groups were then tested with A, AB, and ABC, and responding was measured as the mean percentage of conditioned responses (CRs). Adding B and C to A after conditioning of A has less effect than removing B and C after conditioning of ABC. (B, after Brandon, Vogel, & Wagner, 2000.)

ing CSs from a conditioned compound, but possibly a much smaller effect
from adding them to a single CS.
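The arithmetic behind the two predictions can be laid out in a short sketch. The numbers below are idealized (an asymptote of 1.0 and a made-up 30% replacement fraction), so this is only a back-of-the-envelope illustration of the logic, not a simulation of either model:

# Idealized predictions for the Brandon, Vogel, & Wagner (2000) design.
# All values are illustrative, not data or fitted model output.

asymptote = 1.0          # lambda: the total associative strength available

# Elemental account, group conditioned with the compound ABC:
# A, B, and C compete for lambda and each ends up with about a third of it.
v_per_cs = asymptote / 3
after_ABC_training = {
    "test ABC": round(3 * v_per_cs, 2),  # 1.00 -> all strength present
    "test AB":  round(2 * v_per_cs, 2),  # 0.67 -> dropping C loses a third
    "test A":   round(1 * v_per_cs, 2),  # 0.33 -> dropping B and C loses two-thirds
}

# Elemental account, group conditioned with A alone: adding B and C only
# replaces some of A's elements (say 30%), so most of the strength survives.
replaced_fraction = 0.30
after_A_training = {
    "test A":   asymptote,
    "test ABC": round(asymptote * (1 - replaced_fraction), 2),  # 0.70
}

# Configural account, as summarized in the text: generalization between A and
# ABC is symmetric, so the same proportional drop (to about a third) is
# expected in both directions.
configural = {"train A, test ABC": round(asymptote / 3, 2),
              "train ABC, test A": round(asymptote / 3, 2)}

print(after_ABC_training, after_A_training, configural)

The asymmetry in the elemental figures (a large loss when CSs are removed, a small one when they are added) is exactly the pattern the experiment was designed to detect.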
The results of the test trials are shown in Figure 4.18B. Consistent with
the replaced-elements idea, adding stimuli to A caused a smaller drop in
responding than removing stimuli from ABC. Only the replaced elements
idea can explain this difference. Other differences in the effects of adding
or removing CSs have been reported (e.g., Bouton, Doyle-Burr, & Vurbic,
2012; González, Quinn, & Fanselow, 2003; Rescorla, 1999a). At present, a
“replaced elements” version of elemental theory appears to work better
than a configural theory, but research on this kind of question continues to
uncover new issues (e.g., Harris & Livesey, 2010; Harris, Livesey, Gharaei, &
Westbrook, 2008; Grand & Honey, 2008; Melchers, Shanks, & Lachnit, 2008).

What does it all mean?


Nowadays, when people use the term “SOP theory,” they usually mean
the original model (Wagner, 1981) expanded to include sensory and emo-
tional US nodes (i.e., AESOP; Wagner & Brandon, 1989) and replaceable
elements in the CS nodes (Wagner, 2003, 2008; Wagner & Brandon, 2001).
Even from the beginning, SOP was the most comprehensive account of
Pavlovian learning available. It is really quite impressive what it can now
do. SOP provides a way to combine the compound conditioning effects
addressed in the Rescorla-Wagner model with the attention and priming
effects addressed in the Mackintosh, Pearce-Hall, and short-term memory
models, and it does all that while also providing a good account of the ef-
fects of time, emotional and sensory aspects of conditioning, and stimulus
configuration and generalization. In a way, the SOP model provides the
overall scientific “paradigm” (Kuhn, 1962) in which almost all contem-
porary research on Pavlovian learning can be described and understood.
Some researchers worry that the model’s scope and complexity might
make it difficult to falsify. I am not particularly convinced by this argument.
For example, we have just seen that it is possible to design new experiments
that test the model’s ideas and assumptions. Furthermore, some complex-
ity may be necessary to explain conditioning, which looks awfully simple
at first but actually has a wonderful richness, depth, and complexity to
it. The fact is, SOP can explain, organize, and integrate an impressively
broad range of findings in Pavlovian learning, one of the most subtle and
thoroughly investigated problems in psychology.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources and online quizzes.

Summary
1. Theories of conditioning have made steady progress. Since the early
1970s, they have been able to account for an increasingly wide and
sophisticated range of effects that have been discovered in classical
conditioning.
2. The Rescorla-Wagner model is built on the idea that learning depends
on the surprisingness of the US. If the US is perfectly predicted on a
conditioning trial, it is not surprising, and no learning will occur on that
trial. If the US is not predicted accurately, there is surprise, and the CSs
present on the trial either gain or lose associative strength accordingly.
The associative strengths of all CSs present on a trial add together to
determine what is predicted.
3. The Rescorla-Wagner model assumes that excitors are CSs that acquire
a positive associative strength during conditioning. Inhibitors are CSs
that acquire a negative strength. To explain the effects of zero and
negative contingencies between a CS and a US (which cause no learn-
ing and inhibition, respectively), the model assumes that background
contextual stimuli acquire associative strength and then influence the
CS as any other CS would.
4. Research stimulated by the Rescorla-Wagner model makes it clear that
conditioning is not a simple-minded matter of associating two events
that are paired. Pairing a CS and a US is not always good enough to
cause learning; under some conditions, pairing a CS with a strong US
can actually reduce associative strength or even cause the learning of
conditioned inhibition. Conversely, presenting a CS without a US does
not necessarily result in extinction or conditioned inhibition. Learning on
any conditioning trial depends crucially on the associative strength of all
cues present on the trial.
5. Despite its successes, the Rescorla-Wagner model has several short-
comings. It incorrectly predicts that inhibition will be lost if an inhibitor
is presented without a US. It does not explain latent inhibition. And
it also fails to predict results that suggest that blocking may result at
least partly from the animal learning not to pay attention to redundant
predictors of the US.
6. The Mackintosh and Pearce-Hall models propose that learning is always
influenced by the animal’s attention to the CS. The amount of atten-
tion paid to a CS changes over trials; in fact, it is controlled by previous
learning. According to the Mackintosh model, animals pay attention to
relatively good predictors and tune out bad or redundant ones. Accord-
ing to the Pearce-Hall model, animals attend to uncertain predictors;
they will still attend to a CS on Trial 2 only if the US was poorly pre-
dicted on Trial 1. Hybrid models suggest that both of these attentional
processes can operate and play complementary roles.
7. Wagner’s short-term memory model expanded on the Rescorla-Wagner
model by proposing that learning is determined by the surprisingness
of both the US and the CS on a conditioning trial. Either stimulus is not
surprising if it has been “primed” in short-term memory. Priming occurs
(1) if the stimulus has been presented recently (self-generated priming)
or (2) if a memory of the stimulus has been retrieved from long-term
memory by a stimulus previously associated with it (retrieval-generated
priming). Contextual cues associated with the CS or US can cause
retrieval-generated priming.
8. The short-term memory model explains habituation by proposing that
a stimulus becomes less surprising with repeated exposure. Self-gener-
ated and retrieval-generated priming should both play a role, although
the implication that habituation should be lost with a change of context
has not been confirmed.
9. During conditioning, the subject may learn an association between
nodes in long-term memory that correspond to the CS and US. Wag-
ner’s SOP model proposes that the nodes can be activated to two levels
of intensity: A1, which roughly corresponds to focal awareness; and A2,
which corresponds to peripheral awareness. When a stimulus is pre-
sented, its node is briefly activated to A1; from there, it decays to A2.
An association is formed between two stimuli only if they are both in
A1 at the same time. SOP makes most of the predictions made by the
short-term memory model, but it accounts for even more phenomena,
such as the crucial effects of CS and US timing on conditioning.
10. AESOP, an extension of SOP, proposes that USs have both “sensory”
and “emotive” nodes. Activation of these nodes evokes different kinds
of responses. Emotive nodes control emotional responses, and once
activated, they take more time to decay than do sensory nodes.
11. An additional extension of SOP further proposes that activation of the
CS node is different when the CS is presented alone or along with other
stimuli. This allows the theory to explain certain “configural” effects in
which organisms respond differently when the CS is presented alone
versus being presented in a compound with other stimuli.
12. Although conditioning theories will continue to change and advance,
SOP theory (with its extensions) is a powerful theory that can account for
many of the most important effects in Pavlovian learning.

Discussion Questions
1. What is prediction error? How does prediction error make learning
happen? What is its role in the Rescorla-Wagner model, the Mackintosh
model, and the Pearce-Hall model? How do you think the concept is
represented in SOP?
2. How do the Mackintosh and Pearce-Hall models conceptualize atten-
tion? How do they capture the idea that learning (conditioning) de-
pends on attention? Conversely, how do they propose that attention
depends on conditioning? (Hint: Think prediction and prediction error.)
3. What is the role of short-term memory in conditioning? Is there a role
for short-term memory in Wagner’s SOP model?
4. Suppose that the car you are driving one rainy night skids off the road
and crashes into a ditch while you are negotiating a curve. You are
not hurt, but this event constitutes a major conditioning trial. Use the
concepts in SOP (and AESOP) to conceptualize what happens, and what
you might learn at the sensory and emotional levels, as your car skids
off the road. How will you respond the next time you approach that
curve?

Key Terms
A1  144
A2  144
comparator theory  127
configural theory  150
connections  143
context  125
contextual stimuli  125
elemental theory  151
external inhibition  150
generalization decrement  150
Hall-Pearce negative transfer  132
hybrid attentional model  135
inactive  144
long-term memory  137
negative patterning  151
negative transfer  132
nodes  143
overexpectation effect  124
positive patterning  151
prediction error  116
priming  137
protection from extinction  121
retrieval-generated priming  137
self-generated priming  137
short-term memory  136
sometimes opponent process  144
SOP theory  144
standard operating procedure (SOP)  144
surprisingness of the US  116

Chapter Outline
Memory and Learning  160
  How well is conditioning remembered?  160
  Causes of forgetting  163
  Remembering, forgetting, and extinction  166
  Other examples of context, ambiguity, and interference  171
  Can memories be erased?  173
  Interim summary  177
The Modulation of Behavior  177
  Occasion setting  178
  Three properties of occasion setters  181
  What does it all mean?  183
  What is learned in occasion setting?  184
  Configural conditioning  186
  Other forms of modulation  186
  What does it all mean?  187
Understanding the Nature of the Conditioned Response  187
  Two problems for stimulus substitution  188
  Understanding conditioned compensatory responses  190
  Conditioning and behavior systems  193
  What does it all mean?  197
Conclusion  199
Summary  200
Discussion Questions  201
Key Terms  202
Chapter 5
Whatever Happened to Behavior Anyway?

Chapter 4 was full of details about how information is
processed and stored in memory during Pavlovian
learning (stimulus learning). Theories of conditioning
were covered in some detail because they help explain
the conditions that lead to learning and how learning
actually proceeds. Experiments on classical condition-
ing actually provide some of the most systematic infor-
mation that psychologists have on how basic learning
processes operate. But it is also worth asking how all
the “knowledge” that is learned in conditioning (e.g.,
associative strength, a CS-US association, etc.) even-
tually gets translated into behavior. Historically, that
was what the field of Learning Theory was all about;
since the days of John Watson, it has been interested
in generating principles of behavior. But somewhat
ironically, you might have noticed that, except for one
or two sections, Chapter 4 was fairly quiet about be-
havior. For the most part, conditioning theories focus
on the learning process itself rather than on the critical
changes in behavior that learning ultimately results in.
To steal a quote from Edwin Guthrie, a learning theo-
rist from the 1930s and 1940s, modern conditioning
theories have “left the rat lost in thought.”
This chapter is designed to help restore a little
balance. Whereas Chapter 4 addressed the learn-
ing process in Pavlovian conditioning, in this chap-
ter we will discuss performance processes, or how
stimulus learning gets translated into behavior.
The chapter considers performance processes in several ways. We begin
with the idea that, to influence performance, learned information must be
remembered, activated, or retrieved. We will therefore look a little deeper
at factors that govern memory and memory retrieval, the initial step in
translating learning back into performance. In a way, the question is, After
learning has occurred, what is necessary for conditioned responding to be
evident at all? The discussion then leads us into a related topic that has
figured prominently in conditioning research since the 1980s: the so-called
modulation of learned CS-US associations (e.g., Swartzentruber, 1995). Re-
sponding to a CS can be influenced by other cues in the background that
“set the occasion” for the response, perhaps the way cues are sometimes
thought to control operant responses (see Chapter 1). Finally, in the last
section of the chapter, we will consider what determines the actual form
of the behavior that the CS evokes. That is, when conditioned responding
is evident, what does it look like, and why? This section will return us to
some issues first discussed in Chapter 2, where we saw that signals for bio-
logically significant Outcomes evoke a wide range of interesting behaviors
that are generally organized to get us ready for O. We need to dig a little
deeper into how all that actually works.
As usual, I hope you discover some other interesting effects and phe-
nomena as we learn what we can about performance processes. Let us
start by first considering retrieval and remembering as well as their rather
interesting opposite, forgetting.

Memory and Learning


How well is conditioning remembered?
Once Pavlovian learning has occurred, it can influence performance for a
long time. That is, a CS that undergoes conditioning at one point in time
can still evoke a conditioned response long into the future. In this sense,
animals appear to remember conditioning for impressive intervals. For
example, Robert Hendersen (1985) paired a CS with a mild, a medium,
or a strong electric shock in different groups of rats. He then tested their
memory for conditioning either 1 day or 60 days later by presenting the
CS and measuring how much fear it still evoked. Fear was indexed by
the extent to which the CS suppressed licking water from a tube. (If you
think a 60-day interval is not very long, consider the fact that the rats were
probably only about 90 days old at the start of the experiment.) Figure 5.1
shows the results. The memory of conditioning was almost perfect after
the 60-day “retention interval.” That is, there was no evidence that the rats
forgot anything between the 1- and 60-day intervals. The rats even behaved
as if they rather accurately remembered the intensity of the shock that had
been associated with the CS. The results are consistent with other results
suggesting that conditioning—with either aversive or appetitive USs—can
be retained surprisingly well over time (e.g., Gale et al., 2004; Gleitman,
1971; Hoffman, Selekman, & Fleshler, 1966).
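A brief note on measurement may help in reading these figures. In conditioned suppression procedures, fear is usually summarized with a suppression ratio that compares responding (here, licking) during the CS with responding during an equally long period just before the CS; the formula given here is the conventional one from the conditioning literature rather than anything defined in this passage:

    suppression ratio = responses during the CS / (responses during the CS + responses during the pre-CS period)

A ratio of 0.5 means the CS had no effect on behavior, and a ratio near 0 means responding was almost completely suppressed, indicating strong fear. (Figure 5.1 reports the lick counts themselves; the suppression ratio appears in later figures such as Figure 5.5.)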

Figure 5.1  Conditioned fear 1 day and 60 days after conditioning. The measure of conditioning was suppression of water licking, a variation on the conditioned suppression procedure (y-axis: licks; x-axis: intensity of the US, 0 to 2.0 mA; separate bars for the 1-day and 60-day retention intervals). Regardless of the intensity of the US used in conditioning, there was virtually no forgetting after 60 days. (After Hendersen, 1985.)

This is not to say that conditioning is always remembered perfectly.


Animals do sometimes forget details about the CS or the US. For example,
over time, the animal is more likely to respond to stimuli that are differ-
ent from the original CS. That is, the animal begins to generalize more to
other stimuli (e.g., Perkins & Weyant, 1958; Thomas & Lopez, 1962). It
is as if the animal’s memory of the CS becomes a little fuzzy over time.
The animal’s memory of the US can become a little fuzzy, too; it may also
increasingly generalize between different USs (Hendersen, Patterson, &
Jackson, 1980). This type of forgetting of the details of stimuli (or stimulus
attributes), which occurs in both animals and humans, has some interesting
implications (see Riccio, Rabinowitz, & Axelrod, 1994; Riccio, Richardson,
& Ebner, 1984). For example, it is surprising to think that forgetting can
actually result in an increase in responding to untrained stimuli. Although
the memory of conditioning seems surprisingly durable (see Figure 5.1),
subtle forms of forgetting do occur.
It is also true that some forms of conditioning are forgotten more quick-
ly than others. For example, Hendersen (1978) and Thomas (1979) found
that fear inhibition can be forgotten faster than fear excitation (the kind of
memory tested in Figure 5.1). Hendersen (1978) conditioned fear inhibi-
tion to a light (X) using the A+, AX– procedure, where A was paired with
a shock US and the AX compound was not. A tone (B) was also paired
with the shock on other trials. Both 1 day and 35 days later, the light was
then tested for its ability to inhibit fear of the tone. The results are shown
in Figure 5.2. Excitation was retained very well, as in the above example,
but inhibition was markedly reduced; after 35 days, CS X did not inhibit

Figure 5.2  Conditioned inhibition is forgotten over time. The inhibitor (X) lost some of its ability to inhibit fear of an excitor (B) 35 days, as opposed to 1 day, after conditioning (test stimuli B and BX at each retention interval). In the "suppression index" used here, more fear is higher on the scale. (After Hendersen, 1978.)

fear of CS B. Inhibition thus seems to be more sensitive to the effects of time


than is excitation. The fact that fear inhibition is forgotten more quickly
than fear excitation may explain why fear and anxiety may sometimes seem
to emerge “out of the blue” over time. If an emotional memory is learned
but inhibited and the inhibition fades more quickly than the excitation,
inhibited fears and anxieties may gradually appear in behavior.
Other kinds of learning also seem to be forgotten relatively easily, and
researchers have been able to develop some nice methods to study how
and why organisms do forget (see Spear, 1978, for a review). For instance,
Gordon, Smith, and Katz (1979) trained rats to run from a white compart-
ment to a black compartment in a box during a flashing light to avoid
receiving a shock. (This method is actually an instrumental learning task,
although we will see in Chapters 9 and 10 that Pavlovian learning probably
plays a powerful role.) Four days after the rats had learned to run from
the white compartment to the black compartment, they received memory
testing—they were returned to the white compartment and tested for their
latency to run to the black compartment. No shocks were presented during
the test. Time to run to the black compartment increased from less than a
second at the end of training to about 10 seconds during testing, four days
later. This increase in latency suggests that a lot of forgetting occurred dur-
ing the 4 days between training and testing.
The data for the group of rats that forgot are shown at the left of Figure
5.3. What makes the experiment interesting, though, is the behavior of
the other groups shown in the figure. These rats received “reminder”
treatments sometime before the final test. The idea was to find a way to
reduce the forgetting that otherwise occurred after 4 days. The reminder

Figure 5.3  Forgetting can be reduced by a reminder. All groups were tested 4 days (96 hours) after avoidance learning. High median retention-test latencies (in seconds) indicate forgetting; the groups differ in the reminder-to-test interval (no reminder, 72 hours, 24 hours, or 10 minutes). There was substantial forgetting in the group that received no reminder before the test, but if a reminder was presented either 24 hours or 10 minutes before the test, memory performance was markedly improved. (After Gordon, Smith, & Katz, 1979.)

consisted of bringing the rats back to the experimental room and putting
them in the white compartment (with the flashing light) for 15 seconds.
Different groups received this treatment at different intervals before the
test. Amazingly, it improved memory performance quite dramatically if
it was performed either 10 minutes or 24 hours before the test (see Fig-
ure 5.3, rightmost bars). Presenting the reminder 72 hours before testing
was not effective, however; presumably, the rats forgot all over again.
The effectiveness of the reminder treatment nonetheless illustrates a very
important point about memory: Forgotten memories can be triggered by
reexposure to part of the original learning situation. This is often called
memory reactivation.
You have undoubtedly had similar experiences, such as remembering
an old boyfriend or girlfriend when, later in life, you got a whiff of their
aftershave or perfume. Or, upon hearing an old song on the radio, you
might have suddenly thought of something you did in the sixth grade. In
the animal lab, if fear learning is forgotten, it can be reactivated by reex-
posure to the original shock US (e.g., Campbell & Jaynes, 1966; Spear &
Parsons, 1976; see Spear, 1978, for a review). These effects all indicate that
forgetting can occur even though information is still stored in the brain
somewhere. When forgetting happens, the information is often still avail-
able, but not accessible or successfully retrieved.

Causes of forgetting
There are at least three classic reasons why forgetting might occur. One
possibility is that the memory trace might literally fade away or “decay”
over time. This potential cause of forgetting is known as trace decay.
At first glance, it seems quite plausible as a cause of forgetting because
it seems difficult to believe that the brain stores memory traces forever
(see Hardt, Nader, & Nadel, 2013). On the other hand, reactivation effects
like the ones we just considered (see Figure 5.3) suggest that memories
are often surprisingly intact even after long periods (Gordon et al., 1979;
Spear & Parsons, 1976). The fact that reminders can often jog forgotten
memories indicates that memories can be forgotten without necessarily
becoming decayed or destroyed. Memory theorists have therefore tended
to emphasize two other reasons people and animals forget. Interestingly,
these other causes of forgetting do not necessarily imply that the memory
decays at all over time.
The next possible cause of forgetting is interference (e.g., McGeoch,
1932; Postman & Underwood, 1973). Put simply, memory for information
learned at one point in time can be hurt when conflicting information is
learned at some other point in time. The conflicting information somehow
interferes with access to the target information. There are two types of in-
terference. If the interfering information is learned before the target informa-
tion is learned, we have proactive interference: The memory interference
works “proactively,” or forward, in time. When the interfering information
is learned after rather than before the target information, we have retroac-
tive interference: Interference that works “retroactively,” or backward, in
time. To illustrate, people can be brought into the lab and given two lists of
words (List 1 and List 2) to memorize. The experimenter can then ask them
to remember either the first or the second list (e.g., Barnes & Underwood,
1959; Briggs, 1954; Postman, Stark, & Fraser, 1968). In proactive interference,
List 1 interferes with memory for List 2. In retroactive interference, List 2
interferes with memory for List 1.
In the long run, memories may be forgotten over time because time
permits the accumulation of interference. For example, competing infor-
mation learned during the retention interval could easily cause forgetting
(retroactive interference). Indeed, conflicting information given to people
after they have witnessed crimes or traffic accidents can hurt the accuracy
of eyewitness testimony (e.g., Belli & Loftus, 1996; Loftus, 1979). In addi-
tion, because proactive interference tends to increase over time (e.g., Post-
man et al., 1968), information learned earlier would also increasingly yield
forgetting as time passes. The subject of interference dominated research
on human learning and memory for several decades until roughly the close
of the 1960s (e.g., Postman & Underwood, 1973), and it is still widely seen
as a powerful source of forgetting (e.g., Mensink & Raaijmakers, 1988).
A third major source of forgetting is retrieval failure. The idea here is
that information may remain available in memory, but is forgotten if you
cannot retrieve it. A memory is a little like a book at the library. It might
be in the stacks somewhere, but it will be lost if you do not know where
to find (retrieve) it. To get good memory retrieval, the conditions present
during memory testing need to be as similar as possible to those that were
present during learning. That is, the context must be as similar as possible.
For example, in an experiment that was similar to the one shown in Figure
5.3, Gordon, McCracken, Dess-Beech, and Mowrer (1981) trained rats to
run from white to black compartments during the flashing light to avoid
receiving a mild electric shock. In this case, the training occurred in one of
two distinctive rooms that differed in size, lighting, and odor. Testing then

Figure 5.4  Forgetting also happens after a context change, and it can also be reduced by a reminder. As in Figure 5.3, high latencies (mean latency on the first test trial, in seconds) suggest forgetting; the groups compared are context switch, no switch, and switch plus reminder. The context switch caused forgetting, but it was reduced with a reminder like the one used to ameliorate forgetting over time (see Figure 5.3). (After Gordon et al., 1981.)

occurred 24 hours later in either the same room (Room 1) or the other room
(Room 2). The results are shown in Figure 5.4. There was a clear loss in
performance when testing happened in Room 2. The authors suggested that
the rats failed to retrieve the pertinent information because of the difference
in context. In either rats or humans, performance during memory tests can
be worse when the test is conducted in a context that is different from the
one in which learning originally occurred (see Bouton, 1993, for a review).
The third group of rats shown in Figure 5.4 received a reminder treat-
ment similar to the one we discussed earlier. A few minutes before the test
in the “different” context (Room 2), rats in this group were placed for 15
seconds in a white box similar to the avoidance apparatus’ start box. As
Figure 5.4 suggests, this treatment once again improved performance dur-
ing the test. The point is that forgetting that is caused by either the passage
of time (see Figure 5.3) or a context change (see Figure 5.4) can be alleviated
by a reminder treatment. It is exactly the kind of result that suggests that
forgetting—in either case—occurs because of retrieval failure.
As mentioned above, similar effects have been shown in humans. In a
famous experiment, Godden and Baddeley (1975) showed that scuba divers
who learned a word list while either on land or underwater remembered
the list better when they were tested in the same context—that is, they
remembered best while either dry or wet (respectively), but not vice versa.
Smith (1979) also found that students remember word lists better in the
physical room where the lists were first learned; a switch to another room
reduced that memory. Here, forgetting was reduced when the participants
were instructed to think about the room in which they had learned the list.
It is important to note, however, that context-dependent memory effects
are not always obtained in humans (see Smith, 1988; Smith & Vela, 2001,
for more information). This turns out to be true in animals, too. Simple
conditioned excitation very often transfers to new
contexts, just as it is often remembered well over time (see Bouton, 1993;
Rosas, Todd, & Bouton, 2013).
In summary, the passage of time can lead to forgetting for several rea-
sons. The target memory might decay, similar information learned before or
after the target memory is learned might cause interference, or the memory
may become more difficult to retrieve. In the last case, the passage of time
may lead to changes in the internal or external context that leads to a
mismatch between the learning and the testing contexts (Bouton, 1993;
Bouton, Nelson, & Rosas, 1999; Gordon et al., 1981; Spear, 1978). Of these
possible causes of forgetting, interference and retrieval failure have often
been considered the most prevalent and important. In fact, retrieval failure
may play a role in many situations in which learned information or knowl-
edge is not evident in behavior (e.g., Miller, Kasprow, & Schachtman, 1986).

Remembering, forgetting, and extinction


One situation where interference and retrieval may be important is extinc-
tion (Bouton, 1991, 1993, 2004). Although every model of conditioning we
discussed in Chapter 4 has an explanation of extinction, it is not clear that any
of these models get it quite right. For example, the Rescorla-Wagner model
assumes that extinction trials (trials with no US) are basically conditioning
trials with a US of zero intensity. Because of that, extinction should eventu-
ally make the CS’s associative strength return to zero (see Figure 4.3). That
is, the model assumes that extinction destroys the original learning. This
description is a bit simplified, as we will see in a moment. But there is a pos-
sible problem here. You already know that extinction does not destroy the
original learning: As mentioned in Chapter 2, Pavlov discovered spontaneous
recovery, the phenomenon in which the response returns when time passes
after extinction. For spontaneous recovery to occur, the original learning
could not have been destroyed. If it had, how could behavior ever return?
Pavlov suggested that extinction creates inhibition, which he assumed was
more labile, or fragile, than excitation. Some of the models that followed
the Rescorla-Wagner model, such as the Pearce-Hall model and SOP theory,
went on to accept the idea that the CS acquires inhibition during extinction
(see below). But they did not try to explain why the passage of time causes
spontaneous recovery. Theories of conditioning do not explain one of the
world’s most famous extinction effects.
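To see why the model has this problem, it helps to write out the updating rule it uses, V becomes V + alpha*beta*(lambda - V), and run it through acquisition and then extinction. The sketch below is only illustrative; the learning rate of 0.3 and the ten trials per phase are arbitrary choices:

# Minimal Rescorla-Wagner sketch: acquisition followed by extinction.
# alpha and beta are collapsed into a single arbitrary learning rate.

def rw_update(v, lam, rate=0.3):
    """One trial: strength changes in proportion to the prediction error."""
    return v + rate * (lam - v)

v = 0.0
for _ in range(10):            # acquisition: US present, lambda = 1
    v = rw_update(v, lam=1.0)
print(round(v, 2))             # about 0.97 -> strong conditioned responding

for _ in range(10):            # extinction: no US, lambda = 0
    v = rw_update(v, lam=0.0)
print(round(v, 2))             # about 0.03 -> strength is essentially gone,
                               # leaving nothing that could recover spontaneously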
I have been doing research on extinction for a number of years and
have suggested that one way to go forward is to combine theories of con-
ditioning with the memory principles that were just discussed above (e.g.,
Bouton, 1991, 1993, 2004). Spontaneous recovery is just one of several effects
indicating that extinction does not destroy the original learning. There are
also important context effects. For example, Figure 5.5A illustrates a phe-
nomenon known as the renewal effect (e.g., Bouton & King, 1983). Rats

Figure 5.5  (A) Design of the experiment by Bouton and King (1983):

Group    Phase 1         Phase 2            Test
Ext-A    A: T — Shock    A: T — No shock    A: T?
Ext-B    A: T — Shock    B: T — No shock    A: T?
NE       A: T — Shock    ———                A: T?

(B) Results during extinction (left) and testing (right), plotted as suppression ratios over two-trial blocks. Suppression was "renewed" when the CS was tested in the conditioning context (Context A) after extinction in Context B. (After Bouton and King, 1983.)

first received CS-shock pairings in one context, Context A. (The contexts
were separate Skinner boxes located in different rooms of the lab; they
had different visual, tactile, and odor features.) In a second phase, the rats
then received extinction (trials on which the CS was presented without
shock) in either the same context (Group Ext-A, for “extinction in Context
A”) or in a second context (Group Ext-B, for “extinction in Context B”).
The results (Figure 5.5B) indicated that suppression evoked by the CS
gradually extinguished during this extinction phase. It was interesting,
however, that even though Group Ext-B was receiving the CS in a context
that was different from the one in which conditioning had occurred, this
change in context had no effect on the rate of extinction. Groups Ext-A and
Ext-B showed the same amount of suppression throughout extinction. This
finding is consistent with what I mentioned above—simple conditioned
excitation is often remembered well regardless of the context in which the
CS is tested, just as it is remembered well over time (see Figure 5.1).
But now consider the effects of changing the context after extinction. In
the final phase, all the rats were returned to Context A and tested there with
the CS. Doing so meant that Group Ext-B now received the CS in a context
that was different from the one in which extinction had occurred; in fact, it
was the original conditioning context. As Figure 5.5B indicates, the return
to Context A caused a strong recovery—or “renewal”—of suppression in
this group. Extinction in Context B had not destroyed the original learning,
although Group Ext-B showed less suppression than a control group that
had received no extinction (Group NE). This experiment, along with others
that followed, demonstrates that extinction performance can be relatively
specific to the context in which it is learned (see Bouton, 2002, 2004; Vurbic
& Bouton, 2014, for reviews). One can also create renewal by just remov-
ing the subject from the context of extinction, without returning it to the
original conditioning context (e.g., Bouton & Ricker, 1994). Extinction per-
formance thus depends at least in part on being in the extinction context.
In principle, the renewal effect illustrated in Figure 5.5 is actually consis-
tent with the Rescorla-Wagner model. If you think about it, Context B could
become a conditioned inhibitor during extinction: It is combined with the
CS on no-US trials in extinction, and that could give the context inhibitory
associative strength (see Chapter 4, and especially Figure 4.4). Bouton and
King (1983) therefore tested Context B for inhibition, and surprisingly, it
failed all the tests (see also Bouton & Swartzentruber, 1986). This result,
combined with other evidence, suggests that even though responding to the
CS is suppressed in the extinction context, the context does not work through
simple inhibitory conditioning (see Bouton, 1991, 1993, for reviews). What,
then, is going on? Perhaps the context serves as a cue that retrieves something
like the meaning of the CS; that is, it retrieves the CS’s current relationship
with shock. (For example, after extinction, the CS signals no shock.) And if
extinction does not destroy the original CS-US association, an extinguished
CS actually has two available meanings: “The tone means shock” as well as
“the tone means no shock.” The CS is thus ambiguous, like an ambiguous
word. And, like an ambiguous word, its current meaning depends on the cur-
rent context. To illustrate, your response to someone shouting “Fire!” might
be very different when you are in a crowded movie theater as opposed to
a shooting gallery at the county fair. As effects like the renewal effect sug-
gest, responding to an extinguished CS can also depend on the
context in which it occurs.
To put it all together, Figure 5.6 illustrates some plausible associations that might exist after responding to a CS has been extinguished. During conditioning, the CS is associated with the US, and as we have seen before, the CS therefore evokes a response because it activates the US node or representation. During extinction, this association is not destroyed. Instead, the CS acquires a second, inhibitory association with the US. As noted above, this idea is actually built into both the Pearce-Hall model and SOP theory, which were both introduced in Chapter 4. For example, in SOP, the CS activates the US node to A2 on each trial of extinction; you may remember that this condition is exactly the one that allows a CS to acquire inhibition! The new inhibitory association is gradually acquired in extinction, and it gradually suppresses the conditioned response by inhibiting the US node, which would otherwise be activated by the original excitatory association. After extinction, the CS thus has two associations or meanings: one that excites the US node and one that inhibits it.

Figure 5.6  Associations that a tone CS (T) might have after extinction. The diagram shows nodes for the tone (T), the extinction context (Cx), and the US. In conditioning, the CS was associated with a US; in extinction, a new inhibitory association was formed (–). Activation of this inhibitory association suppresses activation of the US node; to activate the inhibition, however, one must have input from the extinction context as well as the tone. Outside the extinction context, inhibition is not activated, and responding is expected to return. (After Bouton and Nelson, 1994.)

The context's role is also presented in Figure 5.6. Notice that the context node connects with another node on the CS's inhibitory association. This second node activates the final inhibitory association, but only if both the CS and the right context are present. In other words, when the CS is presented in the extinction context, the inhibitory link will be activated,
and the animal will not respond to the CS. But when the CS is presented
outside the extinction context, the inhibitory link will not be activated,
and renewed responding (the renewal effect) will occur. After extinction,
conditioned responding is always “on” unless the extinction context is
present to help switch it “off.” Notice that the excitatory association does
not require a context for activation. This finding is consistent with the
common result, illustrated in Figure 5.5B, that conditioning is less hurt by
changing the context than extinction performance is.
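The logic of Figure 5.6 can be summarized as a simple gating rule: excitation is expressed wherever the CS occurs, but the inhibitory link is only switched on when the extinction context is present. The sketch below is just that rule written out; the unit strengths and the simple subtraction are my simplifications, not the actual associative equations:

# Toy illustration of the structure in Figure 5.6: context-free excitation,
# context-gated inhibition. Values are arbitrary illustrative units.

def cr_strength(cs_present, in_extinction_context,
                excitation=1.0, inhibition=1.0):
    """Return the net tendency to respond to the CS."""
    if not cs_present:
        return 0.0
    net = excitation
    if in_extinction_context:   # the context "switches on" the inhibitory link
        net -= inhibition
    return max(net, 0.0)

print(cr_strength(True, in_extinction_context=True))   # 0.0 -> extinction performance
print(cr_strength(True, in_extinction_context=False))  # 1.0 -> renewed responding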
Another point is that “context” is provided by many types of back-
ground cues (e.g., Bouton, 1993, 2010; Bouton & Swartzentruber, 1991). One
example is the room or Skinner box in which the experiment takes place
(the most common definition of “context”). Another example, however, is
the internal “state” produced by drugs. When fear extinction is conducted
in the presence of a state provided by an anxiety-reducing drug (e.g., Va-
lium, Librium, or alcohol), a renewal effect occurs when a rat is tested
without the drug (Bouton, Kenney, & Rosengard, 1990; Cunningham, 1979;
Lattal, 2007), and the animal is afraid of the CS again. Thus, a drug can also
play the role of context. In addition, a context change may be created by the
passage of time. As time goes by, certain internal body states and external
stimuli are likely to change. Extinction may thus be specific to the context
of a particular time. According to this idea, spontaneous recovery—the
recovery of responding that occurs when time elapses after extinction—is
the renewal effect that happens when the CS is tested outside of extinction’s
temporal context (e.g., Bouton, 1993). Thus, both spontaneous recovery
and renewal occur because the animal fails to retrieve inhibition outside
an extinction context. Consistent with this idea, both effects are reduced if
the animal is given a reminder treatment that reminds it of extinction just
before the test (Figure 5.7; Brooks & Bouton, 1993, 1994). There is a clear

Figure 5.7  Spontaneous recovery (A, the temporal context) and the renewal effect (B, the physical context) can both be reduced by presentation of a cue that reminds the animal of extinction. Analogous to what we saw in Figures 5.3 and 5.4, reminders reduce the failure to retrieve extinction caused by either the passage of time or a context change. Responding in the test (y-axis: percentage of initial performance) is expressed as the percentage of responding that was achieved at the end of conditioning; the bars compare a control cue with an extinction cue. (Data from Brooks & Bouton, 1993, 1994; figure after Bouton, 1994a.)

parallel with experiments on the effects of remembering and forgetting
described in Figures 5.3 and 5.4.
These ideas might have implications for clinical psychologists because
extinction is thought to be the basis of exposure therapies (see Chapter
2) that are designed to reduce certain behavioral problems like anxiety
disorders that may be caused by conditioning. The renewal effect and
spontaneous recovery may both provide reasons for relapse after therapy
(e.g., Bouton, 1988, 2002; Bouton & Swartzentruber, 1991). Thus, the ef-
fects of extinction therapy may tend to diminish if the context is changed
after a therapy session or if time is allowed to elapse. Therapists should be
careful about performing extinction in a unique context, like an unusual
office or while the patient is under the influence of a drug. Instead, they
should try to ensure that their treatments will be retrieved in contexts
where the original problem is most troublesome to their clients, either by
conducting exposure in a similar context or by arming them with skills
or cues that will help them retrieve extinction when conditions make
relapse likely (for more discussion of these issues, see Bouton, 2001, 2002;
Craske et al., 2012, 2014).
These ideas are consistent with some interesting clinical research (see
Vervliet, Craske, & Hermans, 2013, for a review). Fears in humans can
indeed be reduced by exposure to the feared stimulus, but the fear may
be renewed in a different context. For example, Mystkowski, Craske, and
Echiverri (2002) gave students who reported being afraid of spiders a ses-
sion in which the students were exposed to a tarantula in a context provid-
ed by either a room in a lab or a patio outdoors. This reduced their fear. The
students were later tested in the same or different contexts; they reported
more distress when they were tested in a different context, suggesting that
they experienced a renewal of fear (see also Mystkowski, Craske, Echiverri,
& Labus, 2006). Renewal has also been reported in humans given extinction
exposure to alcohol cues. Collins and Brandon (2002) gave undergraduate
social drinkers several exposures to the sight and odor of beer in a distinc-
tive room. These repeated exposures caused a reduction in their salivation
and urge to drink in the presence of the beer cues. But when tested in
a different room both responses (salivation and the urge to drink) were
renewed. Interestingly, this renewal effect was reduced if the participants
were given a cue that reminded them of extinction—as in the animal ex-
periments summarized in Figure 5.7. Renewal has also been reported after
the extinction of cigarette cues (Thewissen, Snijders, Havermans, van den
Hout, & Jansen, 2006). And, in an experiment that was mentioned way back
in Chapter 1, Crombag and Shaham (2002) found that rats that pressed a
lever for an intravenous mixture of heroin and cocaine in one context and
then had the behavior extinguished in a second context showed a powerful
renewal of lever pressing when they were returned to the original context
(see also Bossert, Liu, Lu, & Shaham, 2004; Hamlin, Newby, & McNally,
2007). All these results suggest that the renewal effect is clinically relevant
and may be obtained under a range of circumstances.
Other effects besides spontaneous recovery and renewal indicate that
extinction does not destroy the original learning (e.g., see Bouton, 2002,
2004). For example, if the US is presented on its own after extinction, it
can cause responding to the CS to return when the CS is tested later (e.g.,
Bouton, 1984; Bouton & Bolles, 1979b; Bouton & King, 1983; Delamater,
1997; Rescorla & Heth, 1975). This effect is called reinstatement. Rein-
statement occurs because the new US presentations condition the context,
and this conditioning triggers fear of the CS when the CS is tested in the
same context (e.g., Bouton, 1984; Bouton & Bolles, 1979b; Bouton & King,
1983). Once again, extinction does not destroy the original learning, and
the response one observes depends on the animal’s knowledge about the
context. Still another effect is rapid reacquisition (e.g., Napier, Macrae, &
Kehoe, 1992; Ricker & Bouton, 1996). In this case, the conditioned response
can sometimes return very quickly when CS-US pairings are repeated after
extinction. One explanation is that recent conditioning trials are another
part of the context in which the CS was originally conditioned. Resuming
CS-US pairings after extinction thus returns the organism to the condition-
ing context—and causes a renewal effect (e.g., Bouton, Woods, & Pineno,
2004; Ricker & Bouton, 1996). For a more complete explanation of these ef-
fects and discussion of their applications to clinical phenomena, see Bouton
(2002) and Vervliet et al. (2013).
In summary, I have just described an approach to extinction that com-
bines conditioning theory with two of the memory mechanisms described
earlier in this chapter. First, the learning involved in extinction reduces or
inhibits conditioned performance by creating a form of retroactive interfer-
ence. Second, that interference is controlled by retrieval by the extinction
context. In the proper context, an interfering memory is retrieved; in the
wrong context, it is not. These characteristics of extinction may have many
practical implications for understanding relapse after exposure therapy.

Other examples of context, ambiguity, and interference


It is worth noting that a CS may acquire more than one association in a
number of conditioning paradigms. Table 5.1 lists several of these ef-
fects. For example, consider counterconditioning. In this procedure, a
CS is associated with one US (e.g., shock) in one phase (Phase 1) and then
a different US (e.g., food) in a second phase (Phase 2). As in extinction,
learning from the second phase interferes retroactively with performance
from the first phase. Often, Phase 1 learning also interferes with Phase 2
performance as the Phase 2 pairings proceed. Counterconditioning was
proposed as the theoretical idea behind “systematic desensitization” of
simple phobias (Wolpe, 1958). In systematic desensitization, patients are
gradually exposed to CSs that they are afraid of while the patients are in
a state of relaxation. This method can be a very effective way to reduce
anxiety or fear. The idea is to substitute a new response (e.g., relaxation) for
an old one (e.g., anxiety) by pairing the stimulus with the new response.
Like extinction, however, context and ambiguity might also be involved.

TABLE 5.1  Some paradigms involving ambiguity, interference, and retrieval

Paradigm                          Phase 1        Phase 2
Extinction                        CS+            CS–
Counterconditioning
  Aversive–appetitive             CS — Shock     CS — Food
  Appetitive–aversive             CS — Food      CS — Shock
Verbal interference               List 1         List 2
Latent inhibition                 CS–            CS+
Hall-Pearce negative transfer     CS — Shock     CS — SHOCK!!!

Source: After Bouton, 1993.

Consistent with this idea, spontaneous recovery and renewal effects
both occur after counterconditioning (Bouton & Peck, 1992; Peck & Bouton,
1990). As in extinction, the Phase 2 treatment does not necessarily destroy
the association learned in Phase 1. The first association remains available
and returns to performance with context change over the passage of time.
The other paradigms listed in Table 5.1 are also affected by context and
time. Spontaneous recovery and context effects—in particular renewal ef-
fects—have been observed in all of them (e.g., Bouton, 1993). In fact, latent
inhibition is worth considering again in this light. In Chapter 4, we saw
that latent inhibition is often explained by assuming that CS-processing
becomes habituated during the first phase. However, the role of habitu-
ation may be overrated. We saw that latent inhibition is reduced when
the context is changed after preexposure to the CS, whereas habituation
usually is not affected (e.g., Hall & Channell, 1985). Such results suggest
that habituation and latent inhibition are not always connected; in a new
context, the organism seems to recognize the CS (and thus continues to
show habituation to it) but might not remember what it means. Instead
of only habituating, the animal might learn that the CS means nothing of
consequence (e.g., Bouton, 1993; Westbrook & Bouton, 2010; see also Hall
& Rodriguez, 2010). This type of learning may then depend on the context.
Gordon and Weaver (1989) found that when latent inhibition was reduced
by a context switch, it could be restored if a retrieval cue (a noise that had
been featured during preexposure) was presented before the test. Thus,
latent inhibition looked like it was forgotten with a context switch. Other
evidence suggests that latent inhibition is also forgotten over time (e.g.,
Aguado, Symonds, & Hall, 1994; see also Westbrook, Jones, Bailey, & Har-
ris, 2000). And interestingly, after conditioning has occurred, a return to
the context in which preexposure had occurred can suppress performance,
as if a memory of preexposure had been renewed (e.g., Westbrook et al.,
2000). Thus, preexposure to the CS does not necessarily cause a failure
to learn about the CS later (e.g., Aguado et al., 1994; Kramer & Roberts,
1984); instead, it might interfere with the expression of that knowledge in
performance. Note that this idea is the same as the one we just considered
for extinction. Both effects result from an interference effect that is lost with
context change or the passage of time.
One challenge for this explanation of latent inhibition is that it has never
been clear what the animal actually learns about the CS during the preex-
posure phase. We know that latent inhibition is not the same as conditioned
inhibition (see Chapter 3), so it would be wrong to think that a preexposed
CS predicts “no US” in the same way that a true inhibitor does. Maybe
animals actively seek reinforcers that will satisfy their current needs; for
example, hungry rats might seek food, whereas thirsty rats seek water. If
a rat is hungry or thirsty, a CS preexposed without the reinforcer might be
encoded as irrelevant for finding food or water (Killcross & Balleine, 1996).
The animal might learn that the CS is not useful for predicting a particular
goal. Alternatively, the animal might learn that the CS is just an intermittent
feature of a particular context (see Gluck & Myers, 1993).
Regardless of what is actually learned in latent inhibition, the main
point is that the interference paradigms described in Table 5.1 have a great
deal in common. Information from both phases can be learned and retained,
and performance is determined by the extent to which either is retrieved.
Interference and memory retrieval are important determinants of perfor-
mance in classical conditioning.

Can memories be erased?


All this talk about interference and retrieval failure seems to leave little role
for the decay or erasure of memory. Recently, however, there has been a
great deal of interest in the possibility that memories can be erased under
some conditions. To understand the idea, it is useful to note that newly
learned memories are unstable and easy to disrupt for a few hours before
the brain stores them in a stable form (e.g., McGaugh, 2000). New memories
need to be “consolidated.” The consolidation process is ultimately thought
to require the synthesis of new proteins in the brain. So, if an experimenter
administers a drug that blocks protein synthesis within a few hours after
learning, memory consolidation can be prevented; the animal will now
fail to remember when its memory is tested the next day (e.g., Schafe &
LeDoux, 2000). Other drugs besides protein synthesis inhibitors can also
interfere with memory consolidation (e.g., McGaugh, 2004).
Interestingly, even a consolidated memory can become briefly unstable
again when it is reactivated or retrieved (e.g., Misanin, Miller, & Lewis,
1968). A reactivated memory then needs to be returned to a stable form
through reconsolidation. Consider an experiment by Nader, Schafe, and
LeDoux (2000). Rats first received a single fear conditioning trial in which a
tone CS was paired with a shock US. The next day, after the CS-US associa-
tion had been consolidated, the rats were put in a different context (box)
and given a single presentation of the CS. The idea was to reactivate the
memory learned the previous day. Immediately after that reactivation trial,

Figure 5.8  Reactivated memories need to be reconsolidated. Rats froze less to a CS during testing (y-axis: percentage freezing over three test trials) if they had been given a protein synthesis inhibitor (anisomycin) soon after the memory had been reactivated than if they had been given only the vehicle. Anisomycin prevented memory reconsolidation after reactivation. (After Nader et al., 2000.)

one group received a protein synthesis inhibitor (anisomycin) delivered
the basolateral area of the amygdala). A second group received only the
fluid (or “vehicle”) that the anisomycin was dissolved in. Then, on the third
day, Nader et al. (2000) presented the CS again and tested the rats’ fear of
it. Figure 5.8 shows freezing to the CS during the test. As you can see, the
rats given the protein synthesis inhibitor after reactivation froze less than
the other rats, as if their memory had been impaired. Other experiments
showed that the anisomycin needed to be presented within a few hours
after reactivation; otherwise, there was no impairment. Nader et al. sug-
gested that the anisomycin prevented the memory from being reconsoli-
dated. In principle, a memory might thus be “erased” if it is reactivated
but not allowed to return to a stable form.
The prevention of reconsolidation has now been shown with other
methods and other drugs (see LeDoux, 2015; Nader & Hardt, 2009, for
reviews). It has also been demonstrated in humans. For example, Kindt,
Soeter, and Vervliet (2009) brought people into the lab and gave them sev-
eral trials in which a CS+ was paired with mild shock delivered through
electrodes attached to the wrist. A second CS (a CS-) was presented equally
often, but never paired with shock. Then, on the next day, the participants
returned to the lab and received one of three treatments. One group re-
ceived a single presentation of the CS+ without shock, but before this re-
activation trial, they swallowed a pill containing propranolol, a drug that
blocks certain brain receptors (β-adrenergic receptors) and can interfere
with memory consolidation (e.g., McGaugh, 2004). (Other experiments
have shown that a pill swallowed after the reactivation trial has the same
effect; Soeter & Kindt, 2012). A second group received the reactivation trial
but was not given propranolol (they swallowed a placebo pill instead).
The final group received propranolol, but was not presented with the CS+
(and thus had no memory reactivation). A day later, all the participants
returned to the lab, and their fear of the CS was measured. To accomplish
this, Kindt et al. (2009) attached recording electrodes to the skin near the
eye and measured startle blinks in response to a brief noise burst that
was presented during the CS+, during the CS-, and in the absence of either
stimulus. If a CS arouses fear, it will potentiate the startle response (see
“other forms of modulation” below).
The test results of two of the groups are summarized in Figure 5.9. (The
results of the two control groups—placebo with reactivation and propranolol
without reactivation—were essentially the same, so only the placebo with
reactivation group is shown.) Figure 5.9A and C show startle responding

Figure 5.9  Interfering with memory reconsolidation in humans. If human participants were given a propranolol pill along with memory reactivation, they did not appear to be afraid when the CS was tested a day later (A). They still reported expecting the US (B), however. (C, D) show a control group that had its memory reactivated with a placebo. Panels (A) and (C) plot fear-potentiated startle, and panels (B) and (D) plot shock expectancy, to the CS+ and CS– across acquisition and extinction/test trials. (After Kindt et al., 2009.)

in the two groups during the conditioning trials and during the extinction
test trials that were conducted a day after the reconsolidation treatment.
Notice that the group given reactivation plus propranolol showed very
little startle to the CS+ during the tests (A). It was as if the treatment had
wiped out their fear of the CS. The control group still exhibited fear of the
tone (C). Note the similarity between these results and those of Nader et
al. (2000). Also notice Figure 5.9B and D, however, which show other data
that Kindt et al. collected. On each trial, the participants rated their actual
expectation that the shock would be presented. During the conditioning
trials, you can see that the expectancy ratings steadily increased with CS+
and decreased with CS –. However, the CS+ and propranolol combination
that eliminated fear (A) had no effect on the expectancy of the US (B).
Thus, preventing reconsolidation affected the emotional properties of the
CS. This finding is consistent with the idea, mentioned in our discussion
of AESOP theory in Chapter 4, that conditioning involves separate sensory
and emotional associations. Apparently, preventing reconsolidation mainly
affected the emotional link.
In other tests that are not shown, Kindt et al. (2009) presented three shocks
after extinction testing to see whether they “reinstated” fear after it had
been extinguished (see above). In contrast to the other groups, the blocked-
reconsolidation group showed no evidence of reinstatement at all. The re-
consolidation treatment did not just reduce fear of the CS; it also made fear
more difficult to reinstate. Other investigators have run other tests like this
one (including tests of the effects of context change and the passage of time),
and there is often surprisingly little evidence that fear of the CS can be made
to return after reconsolidation has been blocked (e.g., Duvarci & Nader, 2004).
Research on reconsolidation has many implications for understanding
memory. For example, it suggests that any time a memory is retrieved, it
might be modifiable, at least briefly, before it is stored again (e.g., Nader,
2003; Nader & Hardt, 2009). This possibility seems consistent with the
fact that human memories can become quite inaccurate with repeated re-
membering over time (e.g., Bartlett, 1932). And from a clinical standpoint,
the phenomenon suggests that if one reactivates a patient’s memory for
a trauma, it might be possible to eliminate the memory by administering
a drug that prevents reconsolidation. Unfortunately, tests of this idea in
patients with anxiety disorders have not met with much success to date
(e.g., Wood et al., 2015).
Reconsolidation treatments are not always successful at impairing mem-
ory. For one thing, lab experiments suggest that preventing reconsolidation
seems to eliminate fear mainly when the memory is relatively weak or rela-
tively old (e.g., Wang, de Oliveira Alvares, & Nader, 2009). Second, a simple
reminder before drug administration is not enough. In the Kindt et al. (2009)
method (see Figure 5.9), the shock electrodes need to be attached to the wrist
when the CS is presented during reactivation—so that the participant knows
that something bad can really happen—or else the drug does not impair
the memory (Sevenster, Beckers, & Kindt, 2012). The participant needs to
truly expect the shock during reactivation; there needs to be real surprise
or prediction error. Third, presenting the CS during reactivation does not
just reactivate the memory; it can introduce new learning. For example, the
reactivating trial is usually an extinction trial (but see, e.g., Duvarci & Nader,
2004). Therefore, the subject will begin to learn extinction if the CS is pre-
sented more than once or twice (e.g., Eisenberg, Kobilo, Berman, & Dudai,
2003; Lee, Milton, & Everitt, 2006; Power, Berlau, McGaugh, & Steward, 2006;
Suzuki et al., 2004). This can make things tricky. If the drug is presented after
a single reactivation trial, it can reduce fear by preventing reconsolidation.
But if the drug is presented after extinction has begun, it can now increase
fear by interfering with the fear-reducing effects of extinction learning (Lee et
al., 2006)! Finally, memories that seem impaired by reconsolidation blockade
can sometimes return over time (e.g., Anokhin, Tiunova, & Rose, 2002; Lattal
& Abel, 2004; Mactutus, Riccio, & Ferek, 1979; Power, Berlau, McGaugh, &
Steward, 2006) or return when the drug is presented again just before the test
(Gisquet-Verrier, Lynch, Cutolo, Toledano, Ulmen, Jasnow, & Riccio, 2015).
The fact that the memories can return this way suggests that reconsolidation
treatments might cause some form of retrieval failure instead of real memory
erasure (e.g., Miller & Matzel, 2006). New results are being produced all the
time (e.g., Hardt, Wang, & Nader, 2009; LeDoux, 2015), but at this point, it is
probably safe to conclude that true memory “erasure” is trickier to achieve
than it might seem.

Interim summary
Classical conditioning can have long-lasting effects on behavior, although
certain types of information (about the CS or the US) may be forgotten over
time. When forgetting occurs, it is often caused by either interference or
retrieval failure. For a complete understanding of Pavlovian learning, inter-
ference and retrieval are worth adding to the various learning mechanisms
we discussed in Chapter 4. Extinction is a good illustration. The learning of
something new during extinction causes retroactive interference—the first-
learned association is not destroyed but becomes less accessible, and per-
formance goes away. But extinction (and the interference it causes) seems
especially dependent on the context for retrieval. So, if the CS is presented
in a different context, the original CR can recover or return (spontaneous
recovery or renewal). The same principles may also work in other interfer-
ence paradigms besides extinction. Interference and retrieval are important
processes that influence whether the CR is observed. Memory can also be
affected by consolidation and reconsolidation processes. A blending of
memory and conditioning concepts helps us further understand learn-
ing—and how learning is translated into performance.

The Modulation of Behavior


We have seen that the context is important in controlling responding to the
CS after extinction and other interference paradigms. If the extinguished
CS is presented in the context where extinction was learned, extinction
performance (i.e., not responding to the CS) is observed. But if the CS is
presented outside the extinction context, responding may be renewed. One
way to look at this state of affairs is that the context modulates respond-
ing to the CS. After extinction, it appears to turn responding to the CS off.
Interestingly, it does not seem to work as a simple inhibitor, that is, a CS
with a negative V value or inhibitory association with the US. Researchers
have begun to realize that modulation effects like this may operate quite
often in Pavlovian learning.

Occasion setting
In the early 1980s, several findings began to suggest that a CS can influence
behavior in ways that are not captured by theories like the ones described
in Chapter 4 (e.g., Ross & Holland, 1981; see early reviews by Holland,
1985; Jenkins, 1985; Rescorla, 1985). Those theories are extremely good at
explaining how a CS can enter into associations with a US, but we now
know that a CS sometimes works in a way that does not depend on its
direct association with the US. To use the language promoted by Peter
Holland (see also Moore, Newman, & Glasgow, 1969), who borrowed Skin-
ner’s description of how cues influence operant behavior (Skinner, 1938;
see Chapters 1 and 7), a CS can sometimes modulate responding to another
CS by “setting the occasion for” the conditioned response. When it does,
the CS is known as an occasion setter.
Simply put, you can think of an occasion setter as a cue that provides
information about whether another CS will be paired with a US. Research on
occasion setting has usually focused on the discrimination procedures
shown in Table 5.2. In the feature-positive discrimination, the subject
receives a mixture of trials in which a CS (e.g., a tone) is presented with a
US (“+”) and trials in which it is presented without the US (“–”) (Figure
5.10A). A second CS (e.g., a light) is set up so that it precedes the tone on
the positive trials. Its presence allows the animal to learn to respond only
on the positive trials. The feature-negative discrimination is the logical
reverse. Here again, there are positive and negative trials, but this time the
light signals the negative trials, the ones in which the US does not occur
(Figure 5.10B). In either type of discrimination, the light is called a fea-
ture stimulus, and the tone is called the target stimulus. There is some
method to the madness in the labeling. When the feature (the light) signals
positive trials, we have a feature-positive discrimination. When the feature
signals the negative trials, we have a feature-negative discrimination. The
target is a “target” in the sense that it is the focus of the experimenter’s

TABLE 5.2  Procedures that generate occasion setting


Feature–positive discrimination L → T+, T–
Feature–negative discrimination T+, L → T–

Figure 5.10  Responding to the tone CS during conditioning in a feature-positive discrimination (A) and a feature-negative discrimination (B). (Y-axis: responding to T; X-axis: trials; curves labeled L → T+ and T– in panel A, T+ and L → T– in panel B.)

attention—we want to know whether or not the animal responds to it on
positive and negative trials.
Animals learn these discriminations easily—that is, they quickly learn
to respond only on the positive trials (see Figure 5.10). This should not real-
ly surprise you. Discriminations like this are easy to explain by the models
described in Chapter 4. Let us think about how they do it. “Feature-negative
discrimination” is just another name for Pavlov’s conditioned inhibition
procedure. As we saw in Chapter 4, in the Rescorla-Wagner model (for
example), the animal should associate the target tone CS with the US on
the positive trials (i.e., the tone should gain a positive V value). The animal
should also learn inhibition to the light CS on the negative trials, when the
light is combined with the tone and presented without the US. (The light
should gain a negative V value.) The resulting state of affairs is illustrated
in the lower left panel of Figure 5.11, which shows tone and light nodes
with their corresponding excitatory and inhibitory associations with the
US. When the tone target is presented alone, the subject responds because
the tone node excites the US node. But when the tone is presented with the
light feature, the feature’s inhibitory association inhibits activation of the
US node; therefore, the animal will not respond. That is all it takes to learn
a feature-negative discrimination “the Rescorla-Wagner way.”
The feature-positive discrimination is also reasonably simple. Accord-
ing to the Rescorla-Wagner model, the target tone would gain some as-

Figure 5.11  Associations that might form during feature-positive and feature-negative discriminations learned the Rescorla-Wagner way or the occasion-setting way, where L is the feature CS (a light) and T is the target stimulus (e.g., a tone). Inhibition is indicated by a blocked line.

sociative strength on the positive trials, but two factors work against this.
First, the target loses some associative strength every time it occurs without
the US on the negative trials. Second—and more important—on positive
trials, its boost in associative strength must be shared with the feature light
CS, which is also present. Because the feature is never presented without
the US, it is a more informative predictor of the US. In Rescorla-Wagner
terms, the light’s associative strength will never decline, and it will block
any possible boost in strength to the target CS on the positive trials. There-
fore, after feature-positive discrimination training, the feature—and not
the target—should have a strong association with the US. This state of
affairs is illustrated in the upper left panel of Figure 5.11. When the tone
is presented alone, there is no association with the US and therefore no
conditioned responding. In contrast, when the tone is presented with the
light, the light activates the US node, and this causes conditioned respond-
ing. The successful “solution” of the feature-positive discrimination occurs
because the animal simply associates the US with the light. That is all there
is to the Rescorla-Wagner way. It is a powerful and simple explanation of
the feature-positive discrimination.
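To make the "Rescorla-Wagner way" concrete, here is a minimal simulation sketch of the two discriminations. It assumes the familiar Rescorla-Wagner update rule from Chapter 4, in which each cue present on a trial changes its strength in proportion to the prediction error (lambda minus the summed strength of the cues present). The learning rates, number of trials, and function name below are illustrative assumptions, not values taken from any particular experiment.

def rescorla_wagner_discrimination(feature_positive, n_blocks=200,
                                   alpha_light=0.3, alpha_tone=0.3,
                                   beta=0.5, lam=1.0):
    """Alternate reinforced and nonreinforced trials; return (V_light, V_tone).

    feature_positive=True  simulates the feature-positive case (L -> T+, T-).
    feature_positive=False simulates the feature-negative case (T+, L -> T-).
    """
    V_light, V_tone = 0.0, 0.0
    for _ in range(n_blocks):
        # Reinforced ("+") trial: light-tone compound in feature-positive, tone alone otherwise.
        light_present = feature_positive
        error = lam - ((V_light if light_present else 0.0) + V_tone)
        if light_present:
            V_light += alpha_light * beta * error
        V_tone += alpha_tone * beta * error
        # Nonreinforced ("-") trial: tone alone in feature-positive, light-tone compound otherwise.
        light_present = not feature_positive
        error = 0.0 - ((V_light if light_present else 0.0) + V_tone)
        if light_present:
            V_light += alpha_light * beta * error
        V_tone += alpha_tone * beta * error
    return V_light, V_tone

# Feature-positive: the light ends up with nearly all of the strength; the tone ends near zero.
print(rescorla_wagner_discrimination(feature_positive=True))
# Feature-negative: the tone ends up excitatory and the light ends up inhibitory (negative V).
print(rescorla_wagner_discrimination(feature_positive=False))

The printed values mirror the left side of Figure 5.11: in the feature-positive case, responding on compound trials is carried by the light's excitation, whereas in the feature-negative case the light's negative value cancels the tone's positive value.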
Unfortunately, learning does not always work this way. Instead, some-
times the light feature appears to influence responding by modulating
the target’s association with the US—as illustrated in the two panels on
the right side of Figure 5.11. In these cases, the light feature is more than
a simple excitor or inhibitor—it does not merely have an excitatory or
inhibitory association with the US. Instead, it theoretically works by influenc-
ing the association between the target tone and the US. In the feature-positive
discrimination (see Figure 5.11, top right), the light feature activates the
tone’s whole association with the US. If it does so, it is a positive occasion
setter. In the feature-negative discrimination (see Figure 5.11, lower right),
the light can inhibit the tone’s association with the US. If it does that, it is
a negative occasion setter. In either case, the light feature “sets the oc-
casion” for the target-US association.

Three properties of occasion setters


How in the world do we know all this? One way is to look carefully at the
form of the response that develops as the discrimination is learned. Ross
and Holland (1981) took advantage of an earlier discovery by Holland (e.g.,
1977) that will be discussed in more detail later in this chapter. Holland
found that rats respond in different ways to light and tone CSs when these
cues are associated with food pellets. When a light signals a food pellet,
it evokes a behavior known as “rearing,” in which the rat stands up on
its hind paws. In contrast, when a tone CS signals a food pellet, rearing
behavior is not evoked. Instead, the rat jerks its head around in a rather
excited manner. Holland called this type of response “head-jerk behavior.”
Thus, the “form” of the conditioned response depends on whether a light
CS or a tone CS has been associated with food.
Now consider what the rat might do during compound light-tone trials
in the feature-positive discrimination. If the rat learns the Rescorla-Wagner
way (see Figure 5.11, upper left), there should be a strong light-food asso-
ciation, but no tone-food association because the light has blocked learn-
ing about the tone. Because the tone has been blocked, no responding is
expected on trials when the tone is presented alone. But when the light is
added on the light-tone compound trials, responding will occur because
the light has been directly associated with food. Now, here is the important
idea: Because a light associated with food elicits rearing (as was just noted),
we should therefore expect the rat to rear to the light-tone compound. This
makes sense if the rat has merely associated the light CS with food, the
Rescorla-Wagner way. But if the rat has learned the occasion-setting way
(see Figure 5.11, upper right), the light should cause a different behavior.
Here, it is supposed to modulate the tone’s own association with food. Be-
cause the tone’s association with food causes head jerking, the light should
allow the tone to evoke the head-jerk response. The rat should therefore jerk
its head (indicating tone-food association), not rear (indicating light-food
association), when it is presented with the light-tone compound. By looking
at the form of the response on the compound trials, we can determine if
the rat has learned the occasion-setting way or the Rescorla-Wagner way.
Ross and Holland (1981) found evidence of occasion setting; that is,
the rat jerked its head, rather than reared, in the presence of the light-tone

Figure 5.12  Responding to the light-tone combination after serial (left) or simultaneous (right) feature-positive training. In the serial group, the responding was actually to the tone alone after presentation of the light. The serial procedure caused occasion setting: The light turned on head-jerk responding to the tone. The simultaneous procedure, in contrast, merely allowed the light to be associated with the US (indicated by rearing). (Y-axis: percentage of total behavior; bars: rearing vs. head-jerk behavior in Group serial and Group simultaneous.) (Data from Ross & Holland, 1981; figure after Holland, 1992.)

compound (Figure 5.12, left side). Thus, the light was basically turning
on the tone’s control of behavior. Interestingly, Ross and Holland found
this result when they used a “serial” compound conditioning procedure.
In a serial procedure, the light and tone are presented in a series: The
light is presented (and then turned off) before the tone is presented on the
compound trials. With the serial procedure, the rats learned the occasion-
setting way: The light modulated head-jerking behavior to the sound of
the tone. In contrast, after a more traditional “simultaneous” procedure in
which the light and tone went on and off at the same time whenever they
occurred together, the rats appeared to learn the Rescorla-Wagner way;
that is, the animals learned a simple light-food association. They reared
(and did not jerk their heads) during the compound stimulus (see Figure
5.12, right side). The serial feature-positive discrimination led the light to
activate the tone-food association (occasion setting), whereas the simulta-
neous procedure led to simple light-food learning. The difference is subtle
and mind-boggling. (I will explain this phenomenon in the section below
entitled, “What is learned in occasion setting?”)
Rescorla (1985) soon reported similar results with pigeons in autoshap-
ing. As you already know, in this method, the pigeon pecks at a key on the
chamber wall when illumination of the key (key light) is associated with
food. In Rescorla’s experiments, the key light was the target CS, and a dif-
fuse noise was the feature in a feature-positive discrimination. The noise
was audible for 15 seconds, and the key light came on during the final 5
seconds of the noise. On these trials, the key light was paired with food;
on other trials, there was no noise, and the key light occurred without
food. The noise itself did not elicit any key pecking—that behavior was
only elicited by the key light—but the noise allowed the key light to elicit
pecking when the noise came on with the light. Once again, a feature al-
lowed a target CS to control responding. Somehow, during feature-positive
training, the noise came to set the occasion for pecking at the key light.
In the long run, a careful analysis of the form of conditioned response
thus indicates that the feature is not controlling behavior through a direct
association with food. Instead, it modulates the target’s own association.
This kind of result, on response form, provides the first crucial line of
evidence of occasion setting. At least two other results describe special
properties of occasion setters, however. As before, they both suggest that
a feature in a feature-positive or a feature-negative discrimination might
not influence behavior through its direct association with the US.
The second line of evidence is that an occasion setter will still modulate
responding to the target if we modify its direct association with the US. For
example, if a feature from a feature-positive discrimination were merely
being associated with food, presenting it repeatedly alone (extinction) should
reduce its influence. However, such extinction of a positive occasion set-
ter does not eliminate its impact (e.g., Rescorla, 1986). Analogous results
with negative occasion setters can be even stranger. In a feature-negative
discrimination, the negative feature turns off responding to the target. So,
what should happen if we were to pair the negative feature with a US and
turn it into an excitor—a CS with a direct and positive association with the
US? A simple inhibitor’s power to turn off responding to a target should be
abolished, and that appears to be true (Holland, 1984). But if the feature is a
negative occasion setter, it is still able to inhibit responding to the target, even
after we associated it directly with the US (Holland, 1984)! An occasion set-
ter’s power thus seems quite separate from its direct association with the US.
The third line of evidence for occasion setting is as follows. We have
already seen that excitors and inhibitors usually summate when they are
combined. That is, when an inhibitor is combined with an excitor, it reduces
performance to the excitor, and when an excitor is combined with another
excitor, it increases performance to it (see Figure 3.11). Things do not work
this way in the world of occasion setting (e.g., Holland, 1986, 1989b). Spe-
cifically, if we test an occasion setter’s effect on a new CS, it typically does
not influence responding to the new CS at all. (A “new CS” means a CS
that is different from the target that was in the original feature-positive or
feature-negative discrimination.) There is an important exception, though.
If the new CS has been a target in a separate feature-positive or feature-
negative discrimination, the occasion setter will influence responding to the
new CS (e.g., Lamarre & Holland, 1987; Rescorla, 1985). There is something
rather special about what is learned about both the feature and the target
in an occasion-setting discrimination. This issue will be considered further
after we first pause for a breath and review what we have just learned.

What does it all mean?


To summarize, occasion setters are features from serial feature-positive
and feature-negative discriminations that allow animals to discriminate
positive from negative trials. But they do so in a way that is not predicted
by conditioning theories. Occasion setters have at least three unexpected
properties: (1) They modulate the behavior that is otherwise evoked by the
target, (2) they are not affected much by changing their direct associations
with the US, and (3) they do not always affect responding to new stimuli.
These differences are important because they seem to lie outside the
scope of the conditioning theories we discussed in Chapter 4; they tell us
that stimuli can do more than enter into simple associations. Thus, there is
more to the psychology of learning than simple associative strength. The
differences also have practical implications. For example, occasion setting
may be involved outside the laboratory, wherever conditioning occurs.
Consider drug addiction: Heroin abusers no doubt encounter many differ-
ent cues that are potentially associated with their drug of choice (Siegel &
Ramos, 2002). It is possible that the main CS for the drug is something like
the proximate cue of the needle piercing the flesh. More remote cues that
precede the prick of the needle—like room cues or the sight of the needle
and paraphernalia—might function as occasion setters that set the occasion
for the CS-drug relationship. If we wanted to eliminate an abuser’s habit,
we might try to extinguish the room cues or the paraphernalia cues by
presenting them over and over without the drug. If the cues are occasion
setters, however, that treatment may be ineffective because, as we have
just seen, occasion setters are not affected by simple extinction (Rescorla,
1986). (See Conklin and Tiffany, 2002, for evidence that simple extinction
exposure to drug cues may indeed not be very effective at reducing drug
taking.) Similar arguments apply to the treatment of phobias and anxiety
disorders (e.g., see Bouton, 1988). Fear of a CS may be a problem in one
particular context; therefore, a natural approach would be to extinguish the
context. However, if the context acts as an occasion setter, simple extinc-
tion will not eliminate its effect (e.g., Bouton & Swartzentruber, 1986). For
practical as well as theoretical reasons, it would be good to understand
exactly how occasion setting works.

What is learned in occasion setting?


One way to address this question is to ask what the organism actually
learns in occasion setting. One clue comes from the procedures that produce
it. As noted above, occasion setting is mainly learned in “serial” discrimi-
nations in which the feature is presented before the target. (Simultane-
ous discriminations create features that have the properties expected of
ordinary excitors or inhibitors.) In a serial procedure, the target provides
information about “when” the US is going to happen; in contrast, the occa-
sion setter uniquely signals “whether or not” the US is going to follow the
target (e.g., Bouton, 1997; see also Holland, 1992). Maybe that is the major
difference: Occasion setters might signal “whether” the US will happen,
and the target might signal “when” it will occur.
This rule of thumb is not a bad one, but it does not quite get to the heart
of the matter. In the serial discrimination procedure, the extra interval of
time between the feature coming on and the US being delivered probably
also makes the feature a weaker signal for the US. Remember that con-
ditioning is usually weaker the longer the interval between CS and US.
Perhaps a weak feature-US association is what somehow allows occasion
setting. Consistent with this idea, Holland (1989a) found that occasion
setting can actually be learned in simultaneous procedures if the feature is
less salient than the target. When a dim light feature was combined with a
very loud tone target, for example, occasion setting developed. Thus, the
bottom line may be that occasion setting will occur when a weak feature signals
reinforcement or nonreinforcement of a stronger target.
Rescorla (1988a) suggested why things might work this way. When
the target is strong, it will quickly develop a strong association with the
US because it competes more effectively with the feature for conditioning;
that is, it gets a bigger boost in associative strength every time the com-
pound is paired with the US. Therefore, on trials when the target occurs
without the US, there is a bigger surprise and a bigger inhibitory adjust-
ment. Strangely enough, if you remember what was previously said about
extinction (e.g., see Figure 5.6), this adjustment might mean
that the strong target would wind up with more inhibition
as well as more excitation. This inhibition may play a key role in that occasion setting appears to develop when the procedure allows a lot of inhibition to develop to the target (see Rescorla, 1988a). The occasion setter might also work by somehow modulating that inhibition (e.g., Swartzentruber & Rescorla, 1994).
To capture this idea, Figure 5.13 assumes that the target CS (T) in a feature-positive and feature-negative discrimination gains both an excitatory and an inhibitory association with the US (Bouton & Nelson, 1998a). It is like an extinguished CS (see Figure 5.6) in that excitatory and inhibitory associations are gradually learned when the target is paired with the US and presented without the US, respectively. As Figure 5.13 illustrates, the occasion setter (L) now works by modulating the target's inhibitory association (Bouton & Nelson, 1998a; Swartzentruber & Rescorla, 1994; see also Schmajuk, Lamoureux, & Holland, 1998). This means (somewhat strangely!) that a positive occasion setter actually inhibits the target's inhibition, whereas a negative occasion setter excites the target's inhibition.

Figure 5.13  Occasion setters modulate inhibition to the target CS. During feature-positive and feature-negative discrimination learning, the target CS (T) acquires both excitation and inhibition, as we saw in extinction (e.g., see Figure 5.6). The occasion setter (L) works by either inhibiting or activating the target's inhibitory association with the US.

You might notice that the negative occasion setter is basically working the same way the context works in extinction (see Figure 5.6)—both the occasion setter and the context activate a CS's inhibitory association. The parallel is not surprising, because the context in extinction may function as a negative occasion setter (e.g., Bouton, 2004). Bouton and Nelson (1998a) have further suggested that activation of the target's inhibitory association—like activation of an extinguished CS's inhibitory association—might also depend on input from the context.
Thus, in negative occasion setting, the target CS, the occasion setter, and
the context might all need to be present to activate the final inhibitory link
(see Bouton & Nelson, 1998a, for the evidence). The main point is that oc-
casion setters seem to operate by modulating the target CS’s inhibition.
This begins to explain why occasion setters only influence new target CSs
that have been in an occasion-setting discrimination (see the third property of occasion setters, above). Presumably, those CSs also have an inhibitory association on which
the occasion setter can have its effect.
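One way to make the scheme in Figure 5.13 explicit is with a simple, purely illustrative response rule. The symbols below (the target's excitatory strength V_T^+, its inhibitory strength V_T^-, and a modulatory weight m_L for the occasion setter) are my own shorthand for the verbal account just given, not notation used elsewhere in this book:

\[
\text{CR to } T \;\propto\; V_T^{+} \;-\; m_L\,V_T^{-}
\]

A positive occasion setter corresponds to m_L < 1 (it inhibits the target's inhibition, so the CR appears), whereas a negative occasion setter corresponds to m_L > 1 (it activates the inhibition, so the CR is suppressed). On this reading, the occasion setter never touches the target's excitatory link, which is consistent with the finding that extinguishing the feature, or pairing it with the US, leaves its modulatory power largely intact.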

Configural conditioning
Another idea about what is learned in occasion setting is that the feature-
target compound creates a configural cue that acquires associative strength
(e.g., Kehoe & Gormezano, 1980; Pearce, 1987, 1994; Wagner, 2003; Wagner
& Brandon, 2001; Woodbury, 1943). This idea was introduced in Chapter 4:
When two or more CSs are combined, they can create a new stimulus—a so-
called configural cue—that becomes associated with the US (see Figure 4.17
for an illustration). In this view, when feature and target cues are combined
in feature-positive or feature-negative discriminations, they would create
a unique stimulus that would gain excitation or inhibition and control
conditioned responding accordingly. If things worked this way, it would
not be surprising to find that extinguishing the feature cue on its own, or
perhaps pairing it with the US, might do little to change responding to the
feature-target compound: We have not done anything to the configural
cue! You may recognize this as property number 2 of occasion setting (see
above). The point is that occasion setting might somehow boil down to the
conditioning of configural cues.
Does this sort of approach handle all the facts of occasion setting? It does
quite well with many of them (see Holland, 1992). But the clearest configural
conditioning model (Pearce’s model, introduced in Chapter 4) makes incor-
rect predictions in a number of places that would take far too much space to
describe here (see Holland, 1992; see also Bonardi & Jennings, 2009; Bouton
& Nelson, 1994, 1998a). Most research on occasion setting has taken an el-
emental, rather than a configural, approach. But as research on the problem
continues in the future, configural accounts will need to be kept in mind.

Other forms of modulation


Occasion setting is not the only example of performance to one stimulus
being modulated by another stimulus. Ordinary CSs can also exaggerate
or potentiate the strength of URs or CRs that are elicited by other stimuli.
For example, rats startle more vigorously to a sudden burst of noise if the
noise burst is presented during a fear CS (e.g., Brown, Kalish, & Farber,
1951). Conditioned fear elicited by the CS “potentiates” the startle response
(see Davis, 1992, for a review), a phenomenon known as fear potentiated
startle. Similar effects occur in humans (e.g., Lang, 1995; see also Kindt et
al., 2009, as discussed earlier in this chapter). I am reminded of the moment of panic I once
felt when I heard a sudden rattle while hiking in rattlesnake country. It was
only the sound of some fishing tackle breaking loose in a plastic box that I
was carrying. Fear, apprehension, or anxiety can potentiate defensive reflexes
elicited by other stimuli. Fear of a CS thus modulates the startle response.
Potentiated startle effects are addressed in Wagner and Brandon’s (1989)
AESOP theory (see Chapter 4). Remember that their “affective extension”
of SOP theory argued that CSs enter into associations with both emotional
and sensory aspects of the US. The emotive association elicits emotional
responses directly when it is activated by a CS, but it can also modulate
other responses. Brandon, Wagner, and their colleagues have studied such
effects in rabbits. They conditioned fear to a 30-second CS by pairing it
with an electric shock US delivered near the rabbit’s eye. Brief, 1-second
CSs will elicit an eyeblink CR when they are paired with this US, but the
30-second stimulus does not; it only elicits fear or anxiety. The conditioned
fear stimulus nonetheless will potentiate an eyeblink UR to the shock near
the eye or a startle response elicited by an air puff to the ear (e.g., Brandon,
Bombace, Falls, & Wagner, 1991; McNish, Betts, Brandon, & Wagner, 1997).
It will also potentiate an eyeblink CR that is elicited by a second brief CS
(e.g., Brandon & Wagner, 1991). This form of modulation is different from
occasion setting because it does appear to result from a direct association
between the modulating CS and the US.

What does it all mean?


Modulation processes like the ones we have been discussing are probably
fairly common in conditioning. Thus, the performance one observes to a
CS always depends to some extent on other stimuli in the background. Oc-
casion setting and other types of modulation make good functional sense:
They presumably allow cues that are somewhat remote in time from the
US to control responses that are timed quite well to deal with the US, and
they are important in determining the performance that results in Pavlov-
ian conditioning.

Understanding the Nature of the Conditioned Response
So far, we have been worried about how a learned association is translated
into behavior, but we have said little about the actual nature or form of the
behavior itself that results. What does the organism do when the CS-US
association has been properly modulated or retrieved? It is interesting to
observe that this issue has become a fundamental problem in Learning
Theory only relatively recently. In earlier years, when learning was as-
sumed to involve connecting the CS with the UR, it was very clear what
the behavior should be that resulted from conditioning. It should be the
same as the unconditional response to the US (see Chapter 3). Even in the
stimulus substitution view, which held that the fundamental “content” of
learning was S-S rather than S-R, the nature of the response was also clear.
If the CS merely came to substitute for the US as a kind of surrogate, the
response to the CS after conditioning should also be the same as the UR.

Two problems for stimulus substitution


In the 1970s, two kinds of findings came along that raised big questions
about stimulus substitution. First, researchers discovered that the form of
the CR is not just determined by the UR; rather, the CR is also influenced by
the nature of the CS. Remember Holland’s (1977) peculiar finding that was
used in occasion-setting research and that I promised to return to: Rats jerk
their heads to auditory CSs associated with food pellets, but they rear to
visual CSs associated with the same unconditioned stimulus and therefore
the same unconditioned response.
Timberlake and Grant (1975) reported equally striking results. They
used the presentation of a rat to signal a food pellet to other rats. A rat
was gently fastened to a platform that could swivel through a door and
into the conditioning chamber. Presentation of the rat this way was paired
with food pellets over a series of conditioning trials. What kind of condi-
tioned response do you think was learned by the subject rats? Salivation,
or perhaps a little gnawing on the CS rat? Not at all. When presentation of
the CS rat was followed by a food pellet, the subject rats began engaging
in social contact with it. They approached, licked, groomed, and crawled
over the CS rat. Subjects that received a CS rat unpaired with food showed
little of these behaviors; the contact behaviors thus depended on associat-
ing the CS rat with food. And subjects that received conditioning with a
wood block CS (mounted on the platform instead of a rat) did not show
social contact behaviors either. What seemed to be occurring was an in-
crease in food-related social behaviors directed at the CS. Rats naturally
tend to feed with groups of other rats; they live in colonies and take their
meals socially. Hamsters, which are more solitary, do not show the same
social effects when run in the same type of experiment (Timberlake, 1983).
Stimuli that naturally support social behavior increasingly elicit contact
when they are paired with food. Something about the nature of the CS
selects US-related behaviors that are appropriate to the CS. This is a long
way from stimulus substitution.
The second kind of finding that challenged stimulus substitution was
discovered when drugs were used as unconditioned stimuli. The charge
was led by Shepard Siegel, whose experiments were discussed in Chapter
2. Remember that injection of morphine causes analgesia, a reduction in sen-
sitivity to pain. When morphine is paired with a CS, though, the CR seems
to be a compensatory response; the CS evokes an increase in sensitivity to pain
(e.g., Siegel, 1975). Other research suggests a similar pattern for the body-
temperature effects of alcohol. Alcohol causes a drop in body temperature
unconditionally, whereas CSs associated with it cause an increase in body
temperature (e.g., Mansfield & Cunningham, 1980). The CR and UR again
seem to be opposites, which is hardly consistent with stimulus substitution.

Compensatory responses are not restricted to drug USs. For example,
in Chapter 2, we also saw that a painful shock US can condition an endog-
enous analgesia, which is, once again, an opposite response (e.g., Fanselow &
Baackes, 1982). Similarly, rats exposed to cold temperatures gradually learn
to compensate; their body temperatures show less change with repeated
exposure. Interestingly, this effect is connected with classical conditioning:
When the context is changed, the animal loses its cold tolerance (e.g., Kiss-
inger & Riccio, 1995; Siegel, 2008), perhaps because it loses the compensatory
response. Woods (1991) has noted that similar compensatory processes are
involved in digesting meals. Although food is clearly necessary for survival,
a big meal can actually disturb the body’s equilibrium. Cues that are associ-
ated with meals may therefore elicit conditioned compensatory responses
that get the animal ready to cope with the intake of food.
Siegel himself became interested in a phenomenon in color vision
known as the McCollough effect (McCollough, 1965; e.g., see Siegel &
Allan, 1998). The phenomenon is illustrated in Figure 5.14. Humans can
be shown alternating pictures of a black vertical grid superimposed on
a background of one color (e.g., blue) and a horizontal grid on a back-
ground of another color (e.g., yellow). After multiple exposures of each
grid, a test picture with both the vertical and the horizontal grid on white
can then be shown. Thanks to the earlier exposures, the participant now
sees the opposite color on the two grids (yellow on vertical and blue on
horizontal). What the participant sees are essentially conditioned visual
afterimages. Although other ways of explaining this effect have been pro-
posed (e.g., Dodwell & Humphrey, 1990), a strong case can be made for a
conditioning-like process in which the grids are CSs, the colors are USs,
and the conditioned response is opposite to the original US. For example,
blocking and overshadowing effects can be observed with compounded
vertical and diagonal grids (Siegel, Allan, & Eissenberg, 1994). It all adds
up to something rather important: Conditioning processes seem to be op-
erating everywhere, and compensatory, US-opposite conditioned responses
are not uncommon in human and animal behavior.

Figure 5.14  The McCollough effect. During testing with vertical or horizontal grids previously associated with blue or yellow, for example, participants see the opposite color. (Panels: Train, Test, Participant sees.)

These two challenges to stimulus-substitution have stimulated think-


ing about the nature of the conditioned responding in several important
ways. Let us consider the compensatory CR first and then return to the CS
influencing the form of the CR.

Understanding conditioned compensatory responses


One point to make about the compensatory response is that it suggests the
functional nature of the conditioned response. As argued in Chapter 2, the
CR fundamentally serves to allow the animal to adapt to the upcoming
US (Hollis, 1982, 1997). Thus, the CRs that develop during conditioning
might be the ones that are best in helping the organism get ready for the
US. This perspective represents a shift in how we think about the effect
of conditioning on behavior. Now, we see the result of conditioning as a
set of behaviors evoked by the CS that allow the animal to optimize its
interaction with the US. This perspective is rather broad, however, and it
is not very precise about predicting the actual form that the CR will take.
Another approach is to take a harder look at the nature of the uncon-
ditioned response. Consider SOP theory again. Recall that presentation of
a US is supposed to initiate a standard sequence of events. The US node is
first put into its focally active state, A1, and then it decays to its peripher-
ally active state, A2, before it becomes inactive again (see Chapter 4). It
seems possible that these two levels of activation can sometimes control
different responses. If that were the case, we should see a “response” to the
US that is actually a sequence of two responses: one corresponding to A1
and a second corresponding to A2. The important thing is that, according
to SOP theory, the CS will come to activate the US node into A2. Thus, the
response elicited by the CS should be the second response—not the first—in
the sequence initiated by the US.
Paletta and Wagner (1986) noted that a morphine injection first suppress-
es a rat’s general activity (causing sedation), but eventually this suppression
gives way to a rebound effect in which activity is actually elevated above
the baseline. This sequence is shown in Figure 5.15. If these two responses
correspond to A1 and A2, the SOP prediction is clear: CS-morphine pair-
ings should allow the CS to evoke elevated activity when it is tested on its
own. This finding was confirmed in a second experiment (see Figure 5.15,
inset) in which rats received morphine injections in either the test apparatus
(m-e) or the home cage (m-hc) and a control group received saline in the
experimental apparatus (s). In the test shown, the animals were injected
with saline in the test box. When the box was associated with morphine,
hyperactivity—a compensatory response—was observed. But SOP would
see the compensatory response as consistent with, not opposite to, the UR.
The trick is to analyze the UR in more detail and recognize that the CR will
match the second component. In many cases, we will find that A1 and A2
control the same rather than different responses. In these cases we would not
observe a biphasic unconditional reaction like the one shown in Figure 5.15,
and we would not expect a conditioned compensatory response.

Figure 5.15  Activity after morphine or saline injections. After morphine, activity first decreased (A1) and then increased (A2) above normal. When morphine was associated with a unique box or environment (m-e in the inset), the box elicited an increase in activity—as if it was activating the morphine US's A2 response. (Y-axis: mean activity counts per 5-minute period; X-axis: time pre- or post-injection in minutes; inset: mean activity counts/minute for groups m-e, m-hc, and s.) (Data from Paletta & Wagner, 1986; figure after Wagner & Brandon, 1989.)

In a very important paper, Eikelboom and Stewart (1982) described
another way to think about the UR. Their approach was physiological; we
must recognize that presenting a US causes a whole multitude of physi-
ological reactions or responses. Eikelboom and Stewart noted that for a
“response” to be a UR in the usual sense, the response must be mediated by
the central nervous system; that is, it must be caused by neural activity in the
brain or the spinal cord. This seems rather obvious, in a way. As shown in
the top part of Figure 5.16, classic responses to typical USs used in condi-

Figure 5.16  Making sense of compensatory conditioned responses. A real "response" to a US is the reaction of the nervous system (top). Sometimes, however, a drug US can have a peripheral effect that stimulates the nervous system to compensate (middle). The peripheral effect can be mistaken for a UR, but it actually stimulates the nervous system to compensate, which is the true UR. This phenomenon is illustrated by the effect of insulin (bottom). When the UR is compensatory, so is the CR.
tioning experiments are always mediated by the nervous system. To put it
a little loosely, the US provides input to the brain, and the brain reacts to it.
Eikelboom and Stewart (1982) noted that the definition of the UR gets
confused when we consider the effects of drug USs. Drugs can produce
many “responses” that we can measure. Some are mediated by the central
nervous system in the classic way described at the top of Figure 5.16, but
others are not. For example, consider the effects of injecting insulin. (Insulin
is a hormone, not a drug, but it illustrates the point very clearly.) The effect
of insulin that is usually measured (Siegel, 1972) is a drop in the level of
blood glucose. This effect (at least when the insulin dose is small) occurs
because the hormone promotes glucose uptake into the cells (and out of
the blood). It is a direct effect; the brain has nothing to do with it. It is not
really an unconditioned response in the same sense as Pavlov's original
UR. We might therefore expect it to be something different.
The drop in blood glucose is actually a new stimulus that the nervous
system now reacts to. A sudden drop in blood glucose can be dangerous;
it is detected by the brain, which now responds in a way that is designed
to cope and adapt. Here, then, is where the compensatory response comes
from: It is the brain’s unconditional response to a peripheral input, a drop
in blood glucose. It is a UR. Seen this way, the compensatory CR that devel-
ops over conditioning trials is just like any other CR. It does resemble the
UR, properly defined, and stimulus substitution is preserved. As you have
probably guessed, conditioning experiments with small doses of insulin
tend to lead to the compensatory conditioning of elevated blood glucose
(Siegel, 1972; Woods & Shogren, 1972).
This approach recognizes that compensatory responses are not the only
outcome of conditioning with drug USs. When those responses are caused
by neural activity, the CR will look the same as the UR. When the “re-
sponses” are direct peripheral effects, they stimulate a compensation by the
nervous system, and we observe something that looks like a drug-opposite
effect. The Eikelboom-Stewart perspective also allows some drugs to show
conditioned sensitization effects; that is, conditioning may sometimes cause
an increase in a drug-like response to the CS. This might cause a response
when the CS is tested alone, or it might cause an enhanced response to the
US when the US is presented in the presence of the CS. Research has identified
a number of conditioned sensitization effects with drugs (e.g., Stewart,
1992; see also Robinson & Berridge, 1993).
Furthermore, drugs may have multiple effects on the body, some of
which may be nervous-system mediated and others of which may not be.
The approach predicts that a single drug may condition responses that both
mimic and compensate for URs. For example, injections of atropine cause
both pupil dilation and a drying of the mouth. The first effect is mediated
by the brain, whereas the second effect is not because atropine suppresses
activity at the salivary glands, which presumably initiates a brain reac-
tion designed to compensate. True to the Eikelboom-Stewart perspective,
pupil dilation and salivation can be conditioned at the same time (Korol,
Sletten, & Brown, 1966). The first looks like normal conditioning, whereas
the second is a compensatory response. It all makes sense when one consid-
ers how the drug actually interacts with the nervous system.
Ramsay and Woods (1997) noted that it is idealistic to think that one
can always identify a simple CR or UR in a drug-conditioning experiment.
Drugs have so many effects that any given “response” that we measure
(blood glucose level, body temperature, etc.) is actually bound to be a com-
plex product of a very large number of ongoing physiological processes.
It may therefore be impossible to identify the crucial UR and CR. It may
also be difficult to know what the crucial “stimulus” is that the nervous
system actually detects and reacts to. Given the enormous complexity in-
volved, the best one may do to predict the direction that the CR will take
is to look carefully at the body’s response on a drug’s first presentation. If
the body seems to react by compensating so that the drug’s effect seems
to weaken as the drug concentration in the system is increasing (so-called
acute tolerance), compensatory effects will be conditioned. On the other
hand, if the effect of the drug is still increasing as the drug’s concentration
in the system begins to subside (so-called acute sensitization), conditioned
sensitization effects may be observed.
A complete understanding of the effects of conditioning with drug USs
will require a sophisticated understanding of many physiological process-
es. But notice that a complete understanding of how physiological process-
es maintain the body’s equilibrium may likewise require a sophisticated
understanding of conditioning. Several writers have noted that learning
processes like those in Pavlovian conditioning probably play a widespread
and important role in helping the body regulate equilibrium or homeostasis
(e.g., Dworkin, 1993; Ramsay & Woods, 1997, 2014; Siegel, 2008).

Conditioning and behavior systems


Other sorts of biological processes—in this case, ethological ones—have
been used to help understand why the form of the CR depends on the
nature of the CS (e.g., Timberlake & Grant, 1975). Conditioning is now
thought to engage whole behavior systems, sets of behaviors that are
organized around biological functions and goals (e.g., Timberlake, 1994,
2001; Timberlake & Silva, 1995). Encountering food USs during condi-
tioning makes a whole system of feeding behaviors possible. These might
include behaviors such as foraging, chasing, hoarding, and food-handling.
Sexual USs likewise engage a sexual behavior system (which might include
various search behaviors as well as copulation responses), frightening USs
may engage a defensive behavior system, and so on. In each case, animals
have ways of responding to particular USs that have developed through
evolution, prior learning, or both. Ultimately, presentation of a US in a
conditioning experiment will enable a whole potential system of such be-
haviors, just as it would in the organism’s natural world.

Once a behavior system is engaged, stimuli in the environment provide
“support” (Tolman, 1932) for particular behaviors in the same sense that a
hallway makes it possible to walk and a swimming pool makes it possible
to swim. A CS may determine the CR in this way. Thus, a small moving
object that signals food might support and elicit chase behaviors, whereas
a diffuse noise might mainly activate search behaviors. You have already
been given a glimpse of the behavior-system account of Timberlake and
Grant’s (1975) discovery that rats, used as signals for food, support social
responding. Because the rat is a social feeder, rats come to display certain
social behaviors toward one another when they are hungry or are expect-
ing food. When a rat signals food, the feeding system is engaged, and the
signal thus evokes the corresponding social contact behaviors. There are
presumably no such behaviors in the hamster’s solitary feeding system;
consequently, hamster CSs do not evoke social behavior (Timberlake, 1983).
William Timberlake, one of the main theorists in this area, has described
a feeding behavior system in the rat that is illustrated in Figure 5.17 (e.g.,
Timberlake, 1983, 1994, 2001; Timberlake & Silva, 1995). You can think
of the system as an exhaustive set of behaviors available to the rat that
generally have the function of finding and consuming food. Like earlier
models from ethology (e.g., Baerends, 1976; Tinbergen, 1951), the system
is hierarchically organized so that higher functions have a number of more
specific behaviors and functions that subserve them. At the highest level,
the system has “modes” that correspond to the system’s biggest functions
or goals: the search for—and consumption of—food. Within these modes,

Figure 5.17  A feeding (or predatory) behavior system. Notice the hierarchical relationship among modes, modules, and behaviors. (Columns in the figure: Subsystem, Mode, Module, Behavior.) (After Timberlake, 2001.)

however, are "modules" that correspond to more specific subfunctions;
for example, “general search” has different modules of travel, investigate,
and chase, and each of those has corresponding responses and actions.
Each mode, module, and behavior also has its own support stimuli. For
example, hunger and the onset of nightfall (when rats normally feed) are
stimuli that might support and initiate the general search mode. These
cues would therefore initiate any of the associated modules and actions
organized around the function of finding food. If an animal in the general
search mode were then to encounter a Pavlovian cue for food, the animal
might change to the focal search mode, with new modules and new be-
haviors organized around the function of actually procuring food. Finally,
when food itself is encountered, the animal might switch to a handling/
consuming mode with modules and behaviors organized around the han-
dling and ingesting of food. These types of behaviors can be elicited when
a rolling ball bearing is used to signal food; the rat chases it, handles it,
and chews it (e.g., Timberlake, Wahl, & King, 1982). At any point, behavior
is jointly selected by the current mode and by the environmental stimuli
that support it.
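If it helps to see the hierarchy more concretely, the organization summarized
in Figure 5.17 can be sketched as a simple nested data structure, together with
a toy rule implementing the idea that behavior is jointly selected by the
current mode and the stimuli that support it. This is only an illustrative
sketch: the particular supporting stimuli listed for each module, and the
supported_behaviors function, are assumptions added for the example, not part
of Timberlake's formal treatment.

# A toy sketch of the hierarchical feeding ("predatory") system in Figure 5.17.
# Modes contain modules, and modules contain behaviors. The "support" stimuli
# listed for each module are assumptions added for illustration.
feeding_system = {
    "general search": {
        "travel": {"behaviors": ["locomote", "scan"], "support": ["hunger", "nightfall"]},
        "socialize": {"behaviors": ["crawl over", "sniff"], "support": ["another rat"]},
        "investigate": {"behaviors": ["nose", "paw"], "support": ["novel object"]},
        "chase": {"behaviors": ["track", "cut off"], "support": ["small moving object"]},
    },
    "focal search": {
        "lie in wait": {"behaviors": ["stay still"], "support": ["food cue"]},
        "capture": {"behaviors": ["pounce", "grab", "bite"], "support": ["prey within reach"]},
        "test": {"behaviors": ["gnaw", "hold"], "support": ["object in paws"]},
    },
    "handle/consume": {
        "ingest": {"behaviors": ["chew", "swallow"], "support": ["food in mouth"]},
        "reject": {"behaviors": ["spit out", "wipe off"], "support": ["bad taste in mouth"]},
        "hoard": {"behaviors": ["carry"], "support": ["portable food item"]},
    },
}

def supported_behaviors(mode, stimuli):
    """Behaviors jointly selected by the current mode and the stimuli present."""
    selected = []
    for module in feeding_system[mode].values():
        if any(s in stimuli for s in module["support"]):
            selected.extend(module["behaviors"])
    return selected

# A rat in the general search mode that encounters a small moving object
# (e.g., a rolling ball bearing signaling food) should chase it:
print(supported_behaviors("general search", {"small moving object"}))
# -> ['track', 'cut off']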
A related system for antipredator defense in the rat has been developed
by Michael Fanselow (e.g., Fanselow, 1989, 1994; Fanselow & Lester, 1988;
see also Bolles & Fanselow, 1980). In this system, behavior is organized
around the function of avoiding being eaten by another animal. In this case,
the stimulus that switches the rat between modes is the likelihood that the
rat will encounter a predator. Fanselow has called this likelihood preda-
tory imminence. When the rat first leaves its nest or burrow, predatory
imminence is probably low, and the rat is not likely to encounter a predator
at this time. At this point, the animal will enter a “pre-encounter” mode in
which it might engage in different behaviors (e.g., nest maintenance, meal
pattern organization) that are organized to prevent detection by a preda-
tor. When the rat actually detects a predator nearby, it enters the “post-
encounter” mode in which it primarily freezes and becomes analgesic (see
Chapter 2); this mode is “fear.” If and when the predator is actually about to
attack, however, the animal enters a “circa-strike” mode; here the behaviors
(fighting, threat display, jump attack) are active and presumably designed
for escape. Different defensive behaviors are thus organized around their
function. Each is elicited by different support stimuli. Interestingly, differ-
ent brain areas in the rat also correspond with and control different modes
(e.g., Fanselow, 1994; Perusini & Fanselow, 2015). Some neuroimaging stud-
ies in humans likewise suggest different brain responses to distant versus
nearby threats (Mobbs et al., 2007, 2009).
Michael Domjan (e.g., 1994, 1997, 1998) has presented a model of a
sexual behavior system in the Japanese quail. Some of this research was
mentioned in Chapter 2. Like the feeding system, the sexual system in-
volves a general search mode, a focal search mode, and a consummatory
mode (in this case, actual copulatory behavior). Initially, an animal might
engage in a general search mode. Interestingly, long-duration CSs that
are paired with copulation after they have been on for 20 minutes seem
to support such a mode by evoking pacing behavior (Akins, Domjan, &
Gutierrez, 1994; see below). In contrast, conditioning with a localized light
or a stuffed toy dog as a CS engages focal search: The birds approach the
stimulus. Actual copulatory responding, however, does not occur unless
the CS is a model of a real female, prepared by a taxidermist, which con-
tains plumage and other features of a female quail (e.g., Domjan, Huber-
McDonald, & Holloway, 1992).
Interestingly, conditioning at one step in the sequence may modulate
responding in the next step. For example, although a light CS elicits ap-
proach and not copulation, it does shorten the latency to copulate when
the female US is introduced. Similarly, males that have associated the test
context with copulation are more likely to copulate with a model of a female
than are birds that have had equivalent copulatory experience elsewhere
(Domjan, Greene, & North, 1989). Some of the features of the quail’s sexual
system are summarized in Figure 5.18. The figure emphasizes that dif-
ferent types of CSs may engage different modes (and therefore different
CRs) and that CSs (and the modes they evoke) can also potentiate behavior
elicited at the next point in the sequence.
Behavior systems are usually thought to be organized in time—that is,
cues that are remote in time from the US tend to support certain behaviors,
like search behaviors, that are different from behaviors supported by cues
that are more immediate, which often relate to consumption. This idea
led Akins et al. (1994) to a nice prediction in sexual conditioning (see also
Akins, 2000). Although it is common to think that increasing the interval

Figure 5.18  Summary of conditioning in the male Japanese quail sexual behav-
ior system (e.g., Domjan, 1997, 1998). Different types of CSs engage different
modes and evoke different CRs. If the bird is in the focal search mode, this
also potentiates the consummatory CR. Copulation serves as the US/UR.

Mode              Type of CS              Type of CR
General search    Long, "distal" CS       Pacing
Focal search      Short, "proximal" CS    Approach (potentiates the consummatory CR)
Consummatory      Model of a female       Copulation


Figure 5.19  Short and long CSs that end in the same US will elicit different
CRs. Male Japanese quail were given conditioning with either a 1- or a
20-minute CS. The 1-minute CS elicited approach behaviors (A: percentage of
time near the CS). Although the 20-minute CS elicited very little approach, it
elicited a great deal of pacing (B: crossings per minute), which the 1-minute
CS did not. (After Akins et al., 1994.)

between the onset of the CS and the onset of the US (the CS-US interval)
will decrease the strength of conditioning, Akins et al. realized that CSs
with different temporal relations to the US might support different modes
in the sexual system. In one experiment, they compared the effects of 1-
and 20-minute CSs that were paired with copulation with a female. When
approach to the CS was considered, there was less responding with the
20-minute CS (Figure 5.19A). This is the usual effect of lengthening the
CS-US interval. But when they considered the amount of pacing back and
forth in the test cage, the reverse relationship was obtained; there was more
pacing with the longer, 20-minute CS-US interval (Figure 5.19B). If Akins
et al. had only measured approach, they would have concluded that the
longer CS-US interval merely led to weaker conditioning. But more cor-
rectly, it influenced the qualitative nature of the CR: Pacing behavior may
be linked to general search rather than focal search.
Timberlake et al. (1982) also found qualitative changes in how a rat
behaved toward the ball bearing, depending on the CS-US interval. These
results are not consistent with the simpler view, tacitly accepted in previous
chapters, that the amount of time between CS and US merely influences
the strength of conditioning. The behavior systems approach provides an
important complement to our understanding of conditioning.

What does it all mean?


Once again, it is easy to lose the forest for the trees. What does all this
information about conditioning in animals tell us about learning and be-
havior in humans? Perhaps the main point is that behavior systems theory
gives us a useful way to think about the nature of behavior that results
from conditioning. For one thing, it is important to remember that CSs do
not elicit a simple unitary reflex. Instead, they evoke whole systems—or
constellations—of behavior. At a theoretical level, they do so by engaging
pre-organized modules and modes, which are functionally organized to
help the organism deal with (or cope with) the US. Different kinds of CSs
can support different kinds of CRs, depending on both their qualitative
nature and duration. The cultural stereotype of Pavlovian conditioning—
Pavlov’s dog drooling in response to a ringing bell—is really only the tip
of the iceberg.
Some years ago, I had the opportunity to develop this line of thinking
with two clinical psychologists, David Barlow and Susan Mineka (Bouton,
Mineka, & Barlow, 2001). Barlow is a well-known clinical scientist who has
devoted his career to understanding anxiety disorders (e.g., Barlow, 2002).
Mineka is also a respected clinical psychologist, although she was originally
trained as a researcher in Learning Theory (you might remember some
of her experiments on fear conditioning from Chapter 2). The three of us
were interested in understanding panic disorder, a very common anxiety
disorder in which people experience intense, out-of-the-blue panic attacks
and then come to fear these attacks. For people who develop panic disorder,
the panic attacks become worse and worse, and the fear of having another
one becomes debilitating. Many people with the disorder also develop
agoraphobia—literally a “fear of the marketplace”—in which they become
afraid of leaving the house and going to public places.
Although fear conditioning is widely thought to contribute to anxiety
disorders, there has been much confusion about how it might contribute
to panic disorder (e.g., McNally, 1990, 1994). There can nonetheless be little
doubt that something as emotionally potent as a spontaneous panic attack
can serve as a powerful US (and UR) that can cause fear conditioning.
CSs associated with panic can thus come to arouse fear. However, based
on what we know about the usual organization of behavior systems, we
should probably expect different kinds of modes to come into play. As
summarized in Figure 5.20, we might think that relatively long-duration
or “distal” cues—such as shopping malls or other locations where a panic
attack might eventually occur—provide CSs that support a “preparatory”
mode and CRs that get the system ready for the next panic attack. The CR
elicited by such cues is what we know as “anxiety,” a sense of apprehen-
sion or worry. In contrast, close-up CSs that are proximally associated with

Figure 5.20  Summary of a behavior-system account of panic disorder in humans
(Bouton, 2005; Bouton et al., 2001). CSs of different durations or temporal
proximity to a panic attack may engage different modes and evoke different
CRs (anxiety or panic). Anxiety may also potentiate the panic CR. A panic
attack serves as the US/UR.

Mode            Type of CS                               Type of CR
Preparatory     "Distal" cues (e.g., a shopping mall)    Anxiety (apprehension, worry, etc.;
                                                         potentiates the panic CR)
Consummatory    "Proximal" cues (pounding heart,         Panic (panic attack)
                dizziness, etc.)

panic might support a different CR, actual panic conditioned responses,
which are much more intense than anxiety. Unlike anxiety, panic CRs might
be designed to deal with a powerful emotional event that is already in
progress. You can see the clear connection with the "pre-encounter" and
“circa-strike” behaviors suggested by Fanselow and Lester (1988).
One type of proximal CS often associated with panic is “interoceptive”
(or internal) cues generated by the panic attack itself. That is, patients who
have repeated panic attacks might learn to associate internal cues that are
part of the onset of panic (feeling dizzy or a sudden pounding of the heart,
etc.) with the rest of the panic attack. Conditioning of such early-onset
cues is reasonably well known (e.g., Kim, Siegel, & Patenall, 1999; see also
Goddard, 1999). These cues provide one reason panic attacks get worse
and worse with repeated exposures: Onset cues begin to elicit panic and
thus fan the fire more quickly.
A second reason panic attacks become worse is highlighted in Figure
5.20: Based on what we know about other behavior systems, the prepara-
tory mode elicited by distal cues (anxiety) should be expected to potentiate
panic CRs and URs (e.g., Brandon et al., 1991; Brandon & Wagner, 1991;
Domjan et al., 1989). That is, the presence of conditioned anxiety should
exacerbate panic (e.g., Barlow, 1988, 2002; Basoglu, Marks, & Sengun, 1992;
see also Richter et al., 2012). The parallels between the sexual conditioning
system (see Figure 5.18) and the panic system (see Figure 5.20) are hope-
fully clear (see also Bouton, 2005).
Our conceptualization of panic disorder (Bouton et al., 2001) also used
other information presented in this chapter to help demystify the role of
conditioning. For example, some critics of a conditioning approach have
wondered why a CS like a pounding heart does not elicit panic in the con-
text, say, of athletic exercise or why extinction exposure to the CS without
panic during exercise does not eliminate its tendency to cause panic in
other situations. One answer is that the loss of responding that occurs in
extinction can be expected to be specific to the context where it is learned,
as we saw earlier (e.g., see Figure 5.5). Thus, although panic in response to
the CS of feeling a pounding heart may extinguish in the context of exercise,
extinction in that context will not abolish the CS’s ability to elicit panic in
other contexts, such as in a crowded bus or at a shopping mall. The fact
that excitatory conditioned responses generalize more across contexts than
does extinction may be a reason many behavior disorders are so persistent.
Thus, the trees do make a forest. As you go through this book, I hope
you will see that Learning Theory really does provide a set of powerful tools
for understanding behavior and behavior problems outside the laboratory.

Conclusion
Conditioning influences both physiological and behavioral processes, and
the nature of the CR depends on organized systems that operate at both
of these levels. The physiological responses that will be evoked by a CS
depend on interactions between processes that function overall to maintain
equilibrium. Behavioral responses evoked by the CS depend on behavioral
processes that are organized to help animals cope with motivationally
significant events. In either case, what one measures in a conditioning ex-
periment is now recognized as just the beginning: Pavlov’s dog was doing
much more than just drooling. To understand the form of the responses
evoked by a CS, one must understand the biological function that the sys-
tem might serve in the animal’s natural environment (e.g., Holland, 1984).

Go to the Companion Website at sites.sinauer.com/bouton2e for
review resources and online quizzes.

Summary
1. Conditioning can be remembered very well over time. The forgetting
that occurs can often be alleviated by reminder treatments, which sug-
gests that forgetting does not necessarily mean a permanent loss from
the memory store. Two major causes of forgetting are interference in
which something learned at some other point in time interferes with the
target memory and retrieval failure in which the target memory is not
accessed because the context has changed.
2. Extinction phenomena involve both of these processes. Extinction
itself results from retroactive interference rather than destruction of the
original CS-US association. Extinction performance, however, depends
a great deal on the context for retrieval. When the context is changed
after extinction, extinction is not retrieved, and a recovery of responding
known as the “renewal effect” occurs.
3. The passage of time theoretically causes a change of context. Spon-
taneous recovery is therefore the renewal effect that occurs when the
temporal context changes after extinction. Spontaneous recovery and
the renewal effect can both be alleviated by cues that remind the sub-
ject of extinction. Other paradigms that involve interference may involve
similar retrieval principles. One example is counterconditioning; another
example is latent inhibition.
4. Memories need to be consolidated before they are stabilized in long-
term memory. When a stable memory is reactivated by presenting
relevant retrieval cues, it becomes unstable again and needs to be
reconsolidated. It is possible to impair even a stabilized memory by
interfering with the reconsolidation process.
5. Stimuli can “set the occasion” for a target CS’s association with the US.
Occasion setting often arises in serial feature-positive and feature-nega-
tive discriminations. Occasion setters differ from ordinary CSs in at least
three ways: They influence the behavior that is controlled by the target,
they are not affected by changing their direct associations with the US,
and they do not influence performance to all CSs.
6. The target CS in occasion setting has properties that are similar to an
extinguished CS. That is, the target CS seems to have both excitatory
and inhibitory associations with the US that result from reinforcement
and nonreinforcement, respectively. Occasion setters appear to operate
by modulating the target’s inhibitory association with the US.
7. The conditioned response to the CS is not really the same as the uncon-
ditioned response to the US. Sometimes the two appear to be oppo-
sites, with the CR compensating for the UR. The nature of the CS also
influences the form of the CR.
8. “Responses” are things that are generated by the central nervous
system. True URs must therefore be produced by the brain or the
spinal cord. Drug USs often have peripheral physiological effects that
masquerade as a real response. Instead, they are stimuli that cause a
compensatory reaction from the nervous system. This compensatory UR
leads to the conditioning of a compensatory CR.
9. Conditioning enables whole “behavior systems” that are functionally
organized to deal with the US. The behavior that results from classical
conditioning is therefore quite rich and variable. Different types of CSs
may support different components of the behavior system. That is why
both the qualitative nature of the CS and the length of the CS-US inter-
val are important in determining the form of the CR.

Discussion Questions
1. What makes us forget? Is forgetting always bad, or do you suppose that
it can be functional? Why should organisms ever forget?
2. What is the evidence that extinction does not destroy or erase the origi-
nal learning? What are the implications for understanding relapse after
exposure therapy? Given what you know about extinction, how can we
improve clinical treatments that use extinction to eliminate unwanted
responses or behaviors that have developed through conditioning?
3. What is a context? How do contexts influence learning and
remembering?
4. Provide an example of a feature-positive discrimination and a feature-
negative discrimination from real life. Can you explain them in the
Rescorla-Wagner way? Can you explain them in the occasion-setting
way? How would you know or test whether the feature stimulus is work-
ing as a CS or as an occasion setter?
5. Develop a behavior systems perspective analogous to the ones used
to explain sexual conditioning in quail and panic disorder in humans to
help conceptualize appetite and eating behavior in humans. Are there
any implications for understanding how and when humans eat in the
modern world? Are there any implications for understanding why many
of us eat too much?
Key Terms
agoraphobia  198
behavior systems  193
configural cue  186
consolidation  173
counterconditioning  171
fear potentiated startle  186
feature stimulus  178
feature-negative discrimination  178
feature-positive discrimination  178
interference  164
McCollough effect  189
memory reactivation  163
modulate  178
negative occasion setter  181
occasion setter  178
panic disorder  198
positive occasion setter  181
predatory imminence  195
proactive interference  164
rapid reacquisition  171
reconsolidation  173
reinstatement  171
relapse  170
renewal effect  166
retrieval failure  164
retroactive interference  164
target stimulus  178
temporal context  169
trace decay  163
Chapter Outline
Everything You Know Is Wrong  206

Special Characteristics of Flavor Aversion Learning  208
   One-trial learning  208
   Long-delay learning  209
   Learned safety  211
   Hedonic shift  213
   Compound potentiation  216
   Conclusion  220

Some Reasons Learning Laws May Be General  220
   Evolution produces both generality and specificity  220
   The generality of relative validity  222

Associative Learning in Honeybees and Humans  225
   Conditioning in bees  225
   Category and causal learning in humans  228
   Some disconnections between conditioning and human category and causal
      learning  233
   Causes, effects, and causal power  237
   Conclusion  241

Summary  242

Discussion Questions  243

Key Terms  243

chapter 6

Are the Laws of
Conditioning General?

We have been discussing experiments run in animal learning
laboratories as if these experiments mean
something for the world at large. Our discussion has
accepted what people like Watson and Thorndike first
told us we could assume: The rules of learning that
govern the behavior of a rat or pigeon in a Skinner box
generalize to other examples of animal and human be-
havior. We have been using terms like CS and US (or
R, S, and O) quite abstractly—they are meant to stand
for a very broad range of things and events. Is that
safe? Can we really think of different CSs or different
responses as abstract and more or less interchange-
able? To some extent, you already know we can. The
laws of learning that we have studied so far have been
widely applied to problems in clinical psychology (e.g.,
Mineka & Zinbarg, 2006; O’Donohue, 1998), and in
Chapter 5 we saw some further examples of how prin-
ciples derived from laboratory research on extinction
can predict the effects of exposure treatments in hu-
mans (see pp. 152–157; Vervliet et al., 2013). But this
chapter takes a deeper look at the question. There are
reasons to wonder whether the laws of learning that
we have been discussing so far are always as general
as we have assumed them to be.
Everything You Know Is Wrong


The idea that the laws of learning are general was strongly challenged
during the late 1960s and early 1970s, when several remarkable discover-
ies came along and revolutionized our thinking about the psychology of
learning. This was the period when blocking, contingency effects, relative
validity effects, and so on began a new era of the study of classical condi-
tioning (see Chapters 3 and 4). There were many other things going on at
the time as well. One of the most important was the serious investigation
of taste aversion learning, in which animals learn to reject a flavor when it
is associated with illness. The phenomenon seemed extremely special and
unique. Once taste aversion learning was accepted as a fact that learning
theorists had to deal with, it also fundamentally changed the way that they
thought about learning. To me, the story is fascinating in part because it
was mixed up in the big changes that were occurring in society during the
1960s and 1970s: That is when the United States was deeply involved in a
highly protested war in Vietnam, the civil rights movement and the Great
Society were under way, and the sexual revolution was in progress. Many
accepted truths were being rejected and overturned. To quote a comedy
group that was active at the time (the Firesign Theater), it was as if “every-
thing you know is wrong.”
Taste aversion learning has been mentioned in every one of the preced-
ing chapters. The reason is simple: It has provided many insights into learn-
ing and behavior. But what makes it so crucial in the history of learning
theory? Reports of taste aversion learning date back at least to the 1950s
(Garcia, Kimeldorf, & Koelling, 1955), but when John Garcia first forced
it upon the scientific community in the 1960s in several important papers
(e.g., Garcia, Ervin, & Koelling, 1966; Garcia & Koelling, 1966), two things
seemed almost unbelievable about it. First, as we saw in Chapter 2, rats
appeared to associate the flavor with illness over extraordinarily long in-
tervals of time—even several hours (see Chapter 3). This result, known as
long-delay learning, was completely unheard of in earlier work in animal
learning. In fact, some very rigorous experiments with rats in runways had
indicated that learning was extremely difficult if the interval between the
response and reward delivery was longer than a few seconds (e.g., Grice,
1948; see Renner, 1964, Tarpy & Sawabini, 1974, Boakes & Costa, 2014, for
reviews). Consistent with this result, eyeblink conditioning shows optimal
learning when the trace interval between CS and US is less than 1 second
or so. Viewed in this context, the fact that a rat could associate a flavor
and a bout of illness that followed it by several hours was nothing short
of stunning. It suggested a qualitative difference between flavor aversion
learning and other types of associative learning.
The second thing about taste aversion learning that rocked the boat was
that some of the earliest reports suggested that only certain combinations
of CS and US could be learned. Flavors paired with illness worked great,
but when flavor was paired with an electric shock to the foot, or when
an audiovisual cue was paired with illness, there was little evidence of
learning (Garcia & Koelling, 1966; see pp. 66–69). As I pointed out before,
it was not that the flavor was an unusually salient CS or that the illness
was an unusually potent US. Rather, something about the combination of
the two was crucial in determining learning. This possibility was also very
radical because it made it difficult to accept the assumption that all stimuli
and responses were inherently equal and equally associable. Flavor and
illness are special things to a rat. It became more difficult to think of them
as examples of abstract classes of events, CS and US.
The idea quickly emerged that learning mechanisms have evolved to
deal with specific problems that animals face in the wild. With taste aver-
sion learning, laboratory science had finally stumbled upon evolution’s
solution to the rat’s problem of learning to avoid foods that contain slow-
acting poisons. This finding was mentioned in an earlier chapter, but what
was not mentioned was the implication that created a shock wave through
the community of scientific psychologists. If evolution can generate such
uniqueness in a learning mechanism, why should there ever be any gen-
erality? Have all learning mechanisms evolved independently to solve
every one of an animal’s functional problems? Why should the principles
of learning uncovered in rats that are run in Skinner boxes with arbitrary
stimuli and responses ever generalize to other situations? I hope you can
appreciate the importance of the idea. The methods used in learning theory
were built on the assumption that such stimuli and responses really are arbitrary. Studying rats in Skinner boxes
with the thought that doing so was relevant to the human condition is
based on the idea that learning is a general process. But once there was this
strange, new thing called taste aversion learning, why should we expect
generality at all?
Before we go any further with this idea, it is important to remember that
taste aversion learning does occur in humans (e.g., Bernstein, 1978). The
question is not really whether learning principles generalize from species
to species (many animals show taste aversion learning), but whether the
principles generalize from one example of learning, or learning prepara-
tion, to another. Are the laws that describe taste aversion learning the same
or different from the laws that work to explain fear conditioning, eyeblink
conditioning, or autoshaping?
One illustration of the challenge was Martin Seligman’s 1970 paper,
“On the generality of the laws of learning.” In it, Seligman surveyed some
of the research coming out of this period, which suggested that certain
forms of learning were special and that not all Ss and Os, or Rs and Os,
were equally associable. Seligman proposed that some examples of learn-
ing are evolutionarily “prepared,” some are “unprepared,” and others are
“contraprepared.” For example, rats are prepared by evolution to associ-
ate tastes with illness, but they are contraprepared to associate taste with
an electric footshock. One of the most important things that distinguishes
these types of learning was supposed to be how quickly they are learned;
prepared things take one or two trials, whereas contraprepared things
require a very large number of trials. The main argument, however, was
that although the laws of learning might generalize between examples from
within each category (from prepared to prepared, unprepared to unpre-
pared, and contraprepared to contraprepared), there is no reason to expect
them to generalize between the categories. The laws of learning might not
be as general as the founders of the field had supposed. Everything you
know is wrong.
To some researchers, Seligman’s vision of evolution’s effect on the learn-
ing mechanism had a ring of implausibility to it. Why should there be any
generality even among different examples of “prepared” learning? If pre-
pared learning mechanisms have evolved to handle specific problems—
which was implicit in the argument—why should there be any generality
between types of learning that merely appear to be learned at comparable
rates? Another version of the specialization idea was provided by Rozin and
Kalat (1971), who argued that learning mechanisms might be specifically
adapted to solving particular problems. Taste aversion learning is just one
illustration. We might find that many—even all—mechanisms of learning
are the result of specific adaptations designed to handle specific problems.
This is the version of the challenge that has been passed down to us today.
Thus, at the beginning of the 1970s, there were disturbing questions
about the very foundation of the scientific study of animal learning. No-
tice, however, that although articles like Seligman’s and Rozin and Kalat’s
raised a very fundamental issue, they could not claim to have settled the
problem. Instead, they really raised an empirical question: How general
are the laws that we have discussed in the previous chapters? This chapter
addresses that question. We will first take a harder look at taste aversion
learning and ask how special the apparently unique features of taste aver-
sion learning really are. We will see that there is some generality, and we
will have occasion to wonder why that is. Then we will consider learning
in organisms with brains that have evolved independently of the mam-
malian brain—honeybees—and close by examining associative learning in
humans. Throughout this chapter, the key points are that there is actually a
considerable amount of generality to the principles of learning and that the
investigation of each new possible exception has mainly served to improve
and expand the general laws (e.g., Domjan, 1983).

Special Characteristics of Flavor Aversion Learning


Several things about taste aversion learning have been considered spe-
cial. The question is whether the unique characteristics actually indicate
that taste aversion learning is qualitatively different from other forms of
conditioning.
One-trial learning
It is a remarkable fact that a potent flavor aversion can be learned in one
conditioning trial. If you are one of the many people who have a flavor

Figure 6.1  One-trial flavor aversion learning. Shown is the consumption of a
saccharin solution over several trials. The rats' first drink of saccharin was
followed by an injection of lithium chloride (Cond.), which caused a strong
aversion that was detected when the saccharin was offered on the next trial
(E1). Over three daily trials without lithium (E1–E3), however, the aversion
weakened through extinction. After a wait of 28 days, Group 2 showed a modest
spontaneous recovery of the aversion compared with Group 1, a control group
that received testing (T1–T3) one day—rather than 28 days—after the third
extinction trial. (After Rosas & Bouton, 1996.)

aversion, you know what I mean; a tequila aversion, for example, can
be learned after a single bad experience. Figure 6.1 shows the results of
a fairly typical experiment using rats in which a novel saccharin drink
was associated with an immediate injection of lithium chloride (Rosas &
Bouton, 1996). The two groups tested averaged nearly zero consumption
of the saccharin after just one saccharin-lithium pairing. But notice that
consumption increased reasonably quickly when the saccharin was offered
again without lithium. This result is an example of extinction. Notice also
that when Group 2 had a delay of 28 days between the third extinction
trial and a test, there was a modest spontaneous recovery of the aversion.
These results may help put flavor aversion learning into some perspec-
tive for you; conditioning is fast, but extinction and spontaneous recovery
occur in a way that is not altogether different from more “typical” forms
of classical conditioning.
Impressive as it is, can we say that one-trial learning is really unique to
flavor aversion learning? Strong conditioned fear can be created after just a
single conditioning trial (e.g., Mahoney & Ayres, 1976), and, of course, some
learning probably occurs in all conditioning preparations as a result of the
first CS-US pairing. According to theories of conditioning, the size of the
increase depends on factors like the salience of the CS, the salience of the
US, and so forth. One could easily argue that these parameters are high in
flavor aversion learning. Although taste aversions are definitely impressive
and can definitely be quick—especially under laboratory conditions—few
people consider this strong grounds for arguing that taste aversions are an
example of a qualitatively different form of learning.
Long-delay learning
The story is more interesting when one considers the delay between the
flavor and the US that allows flavor-aversion conditioning. There is no
doubt that this delay is unusual, but we need to consider why it occurs.
The methods used in a taste aversion learning experiment are a little
special. Sam Revusky (1971) was among the first to note this, and he came
up with a rather interesting psychological explanation of why the methods
might allow learning over such long delays. A rat that learns an aversion
to saccharin over a 1-hour delay in the typical lab experiment has little to
do during the delay interval. That is, the animal might taste saccharin at
10:00 and then sit in the cage until 11:00, at which point it is made ill. Typi-
cally, the animal does not eat or taste anything during that period; often,
it probably falls asleep. If we imagine it waking up and thinking “It must
have been something I ate” when it is poisoned, there is little besides sac-
charin to blame the illness on.
Revusky captured this very loose idea by emphasizing two things that are
generally important in conditioning: stimulus relevance and interference. We
have considered stimulus relevance before: For a given US, only some kinds of
CSs are good ("relevant") predictors. When illness is the US, the most relevant
CS is a flavor, and the rat therefore learns about flavors when it receives an
illness US. Revusky's interference idea, in turn, is that relevant stimuli tend
to compete with one another for conditioning. This idea should sound familiar;
it is built into most of the theories of conditioning, which have emphasized
effects like blocking. When our rat spends an hour in the cage between
ingestion of saccharin and inducement of illness, there is no relevant flavor
present to compete with the conditioning of saccharin. Revusky (1971) showed
that if a relevant extra CS (i.e., taste of a second flavor) is introduced
during the delay interval, there is considerably less conditioning to the
saccharin. The result of one of Revusky's experiments is shown in Figure 6.2.
In it, rats received a taste of saccharin (a small, 2-milliliter drink) before
receiving lithium chloride about 75 minutes later. Fifteen minutes into the
delay, the rats received 5 mls of either vinegar or water. As the subsequent
tests of saccharin show, the vinegar taste interfered with the learned aversion
to saccharin. (In other experiments, Revusky also showed that flavors occurring
before the to-be-conditioned flavor likewise interfered with conditioning.)
Thus, one thing that makes long-delay learning possible in flavor aversion
learning experiments is that there are no relevant interfering cues.

Figure 6.2  Interference reduces long-delay learning. In these extinction test
trials (preference for saccharin over three test trials), one group (No
vinegar) showed a strong aversion after a single trial in which saccharin was
followed by lithium chloride after a delay. When the rats also had a taste of
4.5% vinegar during the delay, however, the aversion was markedly weakened.
(After Revusky, 1971.)

The story is different for other kinds of learning. If we ran a comparable
experiment with a tone at 10:00 and a food pellet delivered at 11:15, many
relevant noises and sights (and perhaps the rat's own behavior) would probably
intervene. Therefore, learning over long delays will
be more difficult when relevant competing cues are more likely to occur
during the delay interval.
The Revusky hypothesis led to other experiments. Bow Lett Revusky
(Sam’s wife) ran some new experiments in the T-maze (e.g., Lett, 1973,
1977). Rats were trained to turn right or left in the maze, and then after
a long delay, the response was rewarded with sucrose. The trick, though,
was to remove the rat from the maze immediately after a response and
return it to the home cage. When the delay interval was over, the rat was
returned to the start box and given a reward (or not) as appropriate. With
this procedure, the rats learned to make the correct response with intervals
of at least 60 minutes between response and reward. Lett’s idea was that
removal from the maze immediately after the response limited exposure
to interfering cues that were “situationally relevant.” When control rats
spent part of the delay interval in the maze, the learning was not as good.
Other researchers have also produced evidence of surprisingly good learn-
ing with delayed reward in mazes (e.g., Lieberman, McIntosh, & Thomas,
1979). Thus, long-delay learning is not necessarily restricted to taste aver-
sion learning. And Revusky’s approach suggests that long-delay learning
involving flavor aversion can be predicted from general principles of con-
ditioning: interference and stimulus relevance.
Learned safety
Flavor aversion learning is presumably only one part of a whole system
that might have evolved so that omnivores can learn to discriminate foods
containing poison from foods that are safe to eat. It is possible to think of
the rat actively sorting foods as either dangerous or safe; thus, when the
rat encounters any new food, it might detect and register its consequences.
If illness occurs, an aversion will be learned, but if nothing happens, the
rat might learn that the new flavor is safe.
It is conceivable that this kind of process happens with many types of
stimuli; for example, wary rats might also judge noises and lights as safe or
dangerous vis-à-vis attack from a predator, but this kind of idea has enjoyed
special status when people think about flavor learning. Kalat and Rozin
(1973) suggested that this phenomenon has a role in long-delay learning.
They noted that Revusky’s theory implies that an aversion could form
over an infinite delay if there were no interfering relevant flavors. Such
an aversion typically does not form, however. Why not? Their answer was
that during the delay interval, the rat is gradually learning that the flavor
it tasted is safe, and over time, this knowledge increasingly gets in the way
of learning an aversion if illness does eventually happen. To illustrate the
phenomenon, Kalat and Rozin showed that whereas a rat given a taste
of sucrose at 4:00 and then made ill at 4:30 acquired a strong aversion to
sucrose, the aversion was not as strong if the rat had also had an earlier
exposure to saccharin at about 12:00. They suggested that as time elapsed
after the 12:00 exposure, the rat was learning that the substance was safe,
and this knowledge interfered with the aversion learning made possible
when sucrose was again ingested at 4:00 and then illness occurred at 4:30.
This unusual perspective makes sense if one considers the rat’s problem
of discriminating safe and dangerous foods.
Unfortunately, there is another way to explain the effect of preexpos-
ing the rat to sucrose. Preexposure to the flavor at 12:00 might cause latent
inhibition, a very general phenomenon that can be understood without
supposing that the rat is learning safety at all (remember that one expla-
nation of latent inhibition is that attention to the CS just habituates). In
fairness to Kalat and Rozin, the importance of latent inhibition was not
well understood at the time they published their experiments. But Michael
Best (1975) saw the difference quickly. He reasoned that in aversion learn-
ing, learned safety would actually be analogous to conditioned inhibition;
that is, if the learned aversion is excitation, safety would be analogous to
inhibition. Accordingly, Best first showed that conditioned inhibition can
be produced in flavor aversion learning. He used the A+, AX– procedure
discussed in previous chapters. Rats received a saccharin flavor that was
paired with illness (A+); the pairing caused an aversion to saccharin. On
another trial, a salty taste (a saline solution) was presented just after the
rats received a short taste of saccharin; on this trial, the rat was not made
ill (AX–). The saline (X) was thus presented on an occasion when sac-
charin was not paired with poison. This method is a version of Pavlov’s
conditioned inhibition paradigm (or the feature-negative procedure); as
you know, in most conditioning methods, it would make the saline a con-
ditioned inhibitor. Now think about it: A conditioned inhibitor in taste
aversion learning would plausibly signal that the rat is now “safe” from
poisoning. And indeed, conditioned inhibition (A+, AX–) training makes
the rat actively prefer the inhibitor (X) to water. But Best went on to show
that preexposure to X before A+, AX– conditioning—which should have
boosted safety learning, according to Kalat and Rozin—actually interfered
with, rather than facilitated, learning inhibition to X. This does not make
sense if preexposure causes safety learning. Instead, preexposure to a taste
CS causes latent inhibition, a simple interference with conditioning, just as
it does in other conditioning preparations.
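For readers who like to see the logic worked out, the Rescorla-Wagner model
from Chapter 4 shows formally why A+, AX– training should make X a conditioned
inhibitor: on AX– trials the compound overpredicts an absent US, and the
resulting negative prediction error drives X below zero. The following is only
a minimal sketch; the salience and learning-rate values, and the number of
trials, are arbitrary assumptions chosen for illustration.

# Minimal Rescorla-Wagner sketch of conditioned inhibition (A+, AX- training).
# delta V_i = alpha_i * beta * (lambda - V_total); parameter values are
# illustrative assumptions, not estimates from any experiment.
alpha = {"A": 0.3, "X": 0.3}    # CS saliences (assumed)
beta = 0.5                      # US-related learning rate (assumed)
V = {"A": 0.0, "X": 0.0}        # associative strengths

def trial(cues, lam):
    v_total = sum(V[c] for c in cues)
    for c in cues:
        V[c] += alpha[c] * beta * (lam - v_total)

for _ in range(50):             # interleaved A+ and AX- trials
    trial(["A"], lam=1.0)       # A (e.g., saccharin) paired with illness
    trial(["A", "X"], lam=0.0)  # AX (saccharin plus saline taste) without illness

print(round(V["A"], 2), round(V["X"], 2))
# A ends up excitatory (V > 0) and X ends up inhibitory (V < 0), the formal
# analogue of X signaling that the animal is "safe" from poisoning.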
Best, Gemberling, and Johnson (1979) went on to view the Kalat and
Rozin phenomenon from the perspective of Wagner’s short-term memory
model: They suggested that the first exposure might prime the taste into
short-term memory, rendering it less surprising when it was presented
again later, which reduced the extent to which learning with it could occur.
An experiment they ran to test this hypothesis is illustrated in Figure 6.3.
Preexposure to a taste CS at 12:00 before conditioning at 4:00 (Group 2)
weakened aversion conditioning compared with a group that did not re-
ceive the preexposure (Group 1), as Kalat and Rozin had shown. However,
letting the rat taste a second flavor in between the prime at 12:00 and the
exposure at 4:00 improved the learning about it again (Group 3). Best et
al. (1979) argued that exposure to the second flavor knocked the prime
out of short-term memory and made it surprising again (see Chapter 4).

Figure 6.3  (A) Time line of the experiment by Best, Gemberling, and Johnson
(1979) on the "learned safety" effect. Group 2 received a priming taste of
vinegar 4 hours before it was paired with illness. Group 3 received the same
prime followed by a distracting vanilla taste shortly thereafter. (B) Results
(consumption of vinegar, in ml, for Groups 1-3) indicated that the prime
reduced conditioning of an aversion to vinegar, like Kalat and Rozin's original
finding, but the distractor reduced the priming effect, as predicted by
Wagner's short-term memory model of conditioning. (B, after Best, Gemberling,
& Johnson, 1979.)

The interpretation assumes that short-term memory with flavors might last
quite a bit longer than short-term memory with other kinds of stimuli. (This
possibility could also contribute to why taste aversions can be learned with
long trace intervals.) General processes may apply, however. Indeed, Best et
al.’s study illustrates that taste aversion learning can be used as a method
to test implications of general theories of conditioning and learning.
Hedonic shift
If you have a conditioned taste aversion, you know that there is something
rather visceral and noncognitive about it. For example, tequila does not
seem like a signal for illness; instead, you just cannot stand the stuff. Garcia,
Hankins, and Rusiniak (1974) suggested that the main result of aversion
learning is a shift in the hedonic (pleasant or unpleasant) properties of the
flavor. The flavor is not a cue for a US in the way that a tone is a cue for a
food pellet. The main effect of aversion learning is to change the palatability
of the flavor from pleasant to noxious, a process known as the hedonic shift.
Subsequent research has borne this idea out. For example, several in-
vestigators have used an interesting method called the taste-reactivity
test (Grill & Norgren, 1978; Parker, 1982). The test is based on the fact that
rats react differently to flavors that differ in palatability. For example, when
exposed to a tasty sucrose solution, they show a set of “yum” behaviors,
such as protruding their tongues and licking their paws. In contrast, when
they are exposed to bitter flavors, like quinine, they react differently; they
rub their chins on the floor and walls, they gape, and they shake their paws,

Figure 6.4  Results from a taste-reactivity test. When tasty sucrose is paired
with illness, the rat begins to respond to it as if it were bitter. The test
measured the mean frequency of chin rubs, gapes, and paw shakes over four con-
ditioning trials on which sucrose was paired with lithium chloride or with
saline, along with the unconditional responding elicited by a bitter 0.5%
quinine solution. (After Parker, 1998.)

all of which are essentially “yuck” responses. The discovery is that if we


give a rat a tasty sucrose flavor and pair it with illness, the rat first reacts to
the sucrose with positive “yum” responses. But on testing after condition-
ing, it shows the “yuck” pattern of responding, just as it does to a bitter
taste (Figure 6.4). It is as if aversion learning has turned the tasty sucrose
flavor into a bitter one. Interestingly, Linda Parker has shown that this sort
of effect occurs with some USs (like injections of lithium chloride), which
make a rat sick. But it does not occur with other USs (such as injections of
amphetamine or cocaine), which do not make a rat sick, even though these
USs will also condition suppressed consumption of the flavor (e.g., Parker,
1982, 1988, 1995, 1998). Thus, learning can suppress flavor consumption
through at least two mechanisms, one of which involves a hedonic shift
and another that does not. But when we are dealing with agents that cause
sickness, the taste aversion is associated with a shift in the flavor’s hedonic
properties. The rat reacts as if the flavor is now bitter. Interestingly, flavors
associated with calories, a positive US, likewise show an increase in positive
palatability reactions, as if a neutral—or even aversive—flavor has become
sweet (Forestell & LoLordo, 2003; Myers & Sclafani, 2001).
The question is whether hedonic shift makes flavor learning unique.
You will remember from Chapter 5 that the nature of the CR in all condi-
tioning preparations depends on the nature of both the CS and the US. In
a broad sense, the same is also true in taste aversion learning. That is, the
combination of a flavor CS and an illness US seems to support a particular
pattern of behavior—suppressed consumption and a negative palatability
reaction—just as other CS-US combinations cause their own characteristic
patterns. All CRs are unique; thus, it is easy to accept the idea that flavor
aversions cause a hedonic shift without claiming a radical new law of
learning and behavior.
The question can also be approached from another angle, however. The
idea is that taste aversions involve only a hedonic shift; the flavor is not
supposed to be treated as a signal for the US. Other forms of conditioning
clearly can involve the subject learning that the CS is a signal for the US.
For example, when humans receive pairings of a CS and an uncomfortable
(but not painful) electric shock to the finger, the CS will elicit palm sweating
(an “electrodermal” CR), and the participant will report an expectancy of
shock (e.g., Lovibond, 2003). The person’s ability to report the expectancy
indicates some awareness that the CS signals shock. Some writers have ar-
gued that such awareness is actually necessary to get a CR in humans (e.g.,
Lovibond & Shanks, 2002), although the idea is not accepted universally
(see Clark & Squire, 1998; Manns, Clark, & Squire, 2002; Öhman & Mineka,
2001; Wiens & Öhman, 2002). Interestingly, verbal instructions can have a
powerful effect on the electrodermal CR. For example, if humans are told
about changes in the CS-US contingencies after conditioning, the verbal
input can substantially modify the CR (Lovibond, 2003; see also Cook &
Harris, 1937; Dawson & Schell, 1985). But I know of no similar experiment
showing that verbal input can likewise influence a conditioned taste aver-
sion. Indeed, the idea seems implausible; for example, despite repeated
conversations throughout my childhood, I do not think that my mother
ever talked me out of my taste aversion to broccoli. The point is that taste
aversions do not seem easy to penetrate with verbal input; they seem to be
a more visceral form of learning than electrodermal conditioning.
There are findings with animals that also suggest that a taste aversion
is rather visceral and noncognitive. After animal conditioning occurs, we
can “revalue” the US in a number of different ways; for example, we can
habituate the US or reduce the original magnitude of the US in some other
fashion (a “deflation” treatment, e.g., Rescorla, 1973; Holland & Rescorla,
1975b; Holland & Straub, 1979; see Chapter 3). Alternatively, we can expose
the subject to a series of larger USs (an “inflation” treatment, Bouton, 1984;
Rescorla, 1974). In fear conditioning and appetitive conditioning, these
treatments influence the strength of the CR. That is, exposure to weakened
USs after conditioning decreases the strength of the CR, and exposure to
larger USs increases the strength of the CR. If we change the representation
of the US after conditioning, responding to the CS changes accordingly.
These results thus suggest that in fear conditioning and appetitive condi-
tioning, the CS does indeed act as a signal for a representation of the US.
Little work has been done on these paradigms in taste aversion learning,
but at least some of it suggests that taste aversions are not revalued in the
same way. Perhaps the flavor is treated as something different from a signal
for the US. Jacobs, Zellner, LoLordo, and Riley (1981) ran a deflation experi-
ment in which rats first received a pairing of saccharin with an injection of
morphine that made them sick. The conditioning trial created an aversion
to the saccharin. In the next phase, the rats received repeated injections
of morphine. As you know by now, repeated morphine injections make
the body tolerant to the drug, and the drug thus becomes less aversive
than before. In fact, in Jacobs et al.'s (1981) experiment, there was evidence
that the animals became dependent on the morphine. To risk being a little
anthropomorphic, at the end of the tolerance phase there was a sense in
which the rats probably liked the morphine; in fact, they probably would
have sought it out. Interestingly, even though the treatment revalued the
rats’ reaction to the morphine US, a final test showed that the saccharin
aversion was not affected. In this sense, the rats did not behave as if they
treated the saccharin as a signal for the US.
I ran some related unpublished experiments with Leigh Stockton many
years ago. Instead of deflating the US, we inflated it. Rats received pairings
of saccharin with a weak dose of lithium chloride and then several injec-
tions (spaced over many days) of a larger, stiffer dose. This method had no
effect on the strength of the aversion when we tested it again. (DeCola and
Fanselow, 1995, suggested that inflation effects can occur in taste aversion
learning, but not if there is a delay between the CS and US on the condition-
ing trial.) Our results, in a tentative way, are again consistent with the idea
that flavor-illness pairings might not always make the flavor a “signal” in
the same sense that it is a signal in fear or appetitive conditioning.
An important point is in order here, though. We do not, in fact, know how
“general” the US revaluation effect is. It has not been studied much outside
of fear conditioning and appetitive conditioning. For example, one would
be hard-pressed to find an experiment involving US revaluation in eyeblink
conditioning. And even in the fear and appetitive conditioning preparations,
there are some differences. For example, in fear conditioning, US deflation
works with first-order conditioning (Rescorla, 1973), but not with second-
order conditioning (Rizley & Rescorla, 1972). In appetitive conditioning with
rats, the same pattern appears to hold (e.g., Holland & Rescorla, 1975b); but
in autoshaping with pigeons, deflation effects have been observed (Rescorla,
1979). There is a possibility that conditioning can take the form of S-S or
S-R learning in different conditioning systems and preparations, but this
question has not really been explored. For the time being, we may note that
taste aversion learning might be represented differently than first-order fear
conditioning or first-order appetitive conditioning.
Compound potentiation
John Garcia’s laboratory dropped another bombshell in the late 1970s. In
several new experiments (Rusiniak, Hankins, Garcia, & Brett, 1979), rats
were given a dilute solution of almond extract mixed in water (Figure
6.5A). Although the rats took the solution into their mouths, the almond
extract produced an odor; it was detected retronasally through the nasal
holes that rats (and humans) have in the back of the mouth. When pre-
sented alone before illness, the odor acquired only a weak conditioned

Figure 6.5  Potentiation of odor conditioning by taste. (A) A rat is given a
taste (water or saccharin) in drinking water that also contains an odorant.
(In many experiments, the odor comes from a cup near the spout instead of
being mixed in the drink.) (B) When odor is paired with illness on its own and
then tested (O-O), consumption (licks × 100) is not suppressed much, but if it
has been combined with a taste on the conditioning trial (OT-O), strong odor
conditioning is obtained. (A, after Inui, Shimura, & Yamamoto, 2006; B, after
Palmerino et al., 1980.)

aversion. But when it was combined with a saccharin taste so that the rat
drank a compound almond-saccharin mixture that combined the almond
odor with sweet (roughly the flavor of marzipan), a remarkable new thing
occurred: The saccharin taste, known to be a salient CS for illness, actually
increased the conditioning acquired to the almond odor (Figure 6.5B). This
result, called compound potentiation, is important because it is precisely
the opposite of what most conditioning theories predict. For example, the
Rescorla-Wagner model assumes that CSs presented in a compound com-
pete with one another for conditioning, and they must share the overall
amount of conditioning that is supported by the US. In fact, all the models
considered in Chapter 4 predict that when a salient CS is compounded
with a weaker CS, it should cause overshadowing: The stronger CS should
reduce the associative strength acquired by the weaker element. All the
models we have considered predict competition and overshadowing, rather
than potentiation, of odor conditioning by the taste.
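To see why in symbols, recall the Rescorla-Wagner learning rule from Chapter 4 (written here in one standard notation rather than in terms specific to these experiments). On each compound conditioning trial, the odor's associative strength changes by

\[
\Delta V_{\mathrm{odor}} = \alpha_{\mathrm{odor}} \, \beta \, (\lambda - V_{\mathrm{odor}} - V_{\mathrm{taste}}),
\]

and the taste is updated by the same shared error term. Because a salient taste has a large \( \alpha \) and quickly claims most of \( \lambda \), the quantity in parentheses shrinks, leaving the odor with less strength than it would have earned if conditioned alone. The rule can only produce competition of this sort; it has no mechanism by which adding the taste could increase what the odor acquires.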
Other research has established that the effect also occurs when the odor
is presented on a cup behind the drinking spout rather than being mixed
directly in the water (see Figure 6.5A). Garcia and his colleagues provided
several explanations that all emphasized the different functional roles that
odors and tastes have. One early view (Rusiniak et al., 1979) held that odor
is a distal cue that controls the rat’s approach to foods, whereas taste is a
proximal cue that guides consumption. Because there was no redundancy
in function, perhaps there was no competition. Another view (e.g., Garcia,
1989; Palmerino, Rusiniak, & Garcia, 1980) emphasized that tastes are im-
portant in defending the gut from poison—the “gut defense” system. Odor,
on the other hand, is connected with many things besides foods; odors
mark the presence of other rats, potential predators, potential mates, and so
forth. In fact, odors are sometimes described as part of the “skin defense”
system. The effect of the taste was to make the odor a food cue, allowing
it access to the gut defense system. That is what made it more connectable
with poisoning, particularly over long delays. The approach emphasizes
the special functional roles played by odor and taste.
The potentiation effect generated some intense research and debate. It
was soon shown that tastes could also potentiate conditioning to contextual
cues (e.g., Best, Batson, & Bowman, 1990; Best, Batson, Meachum, Brown,
& Ringer, 1985; Best, Brown, & Sowell, 1984). In these experiments, either
saccharin or water was presented in a particular box, and then the rat was
made ill. When rats later received a different, palatable solution (e.g., saline)
to drink in that same box, consumption was more suppressed in the group
that had received the saccharin there during conditioning. The saccharin
had potentiated conditioning of the box. Presumably, context cues differ
from taste cues in the same way that odors differ from tastes in that context
cues are exteroceptive; they are distal cues that are not ordinarily used in
“gut defense.” More recent research has also discovered other interactions
between odors and tastes that suggest that they do not merely compete
with one another when they are conditioned in compound (e.g., Allswede,
Curley, Cullen, & Batsell, 2014; Batsell & Batson, 1999; Batsell, Paschall,
Gleason, & Batson, 2001; Batson & Batsell, 2000).
Durlach and Rescorla (1980), however, pointed out that potentiation
could be understood in terms of general conditioning laws. When taste and
odor (or taste and context) are combined, the animal has an opportunity
to associate them. The animal might form a within-compound associa-
tion; that is, it might form an association between the two elements in the
compound, which can happen widely in compound conditioning experi-
ments (e.g., Rescorla & Durlach, 1981). Therefore, any conditioning that
might accrue to the taste is readily transferred to the odor, a kind of guilt by
association. Consistent with this idea, Durlach and Rescorla (1980) found
that extinction of the taste after odor-taste conditioning reduced aversion to
the odor (but see Lett, 1984). Similar effects of extinguishing the taste have
been observed in taste-context potentiation (e.g., Best et al., 1985). Recent
experiments have also shown that further conditioning of the taste further
strengthens the aversion to the odor (Batsell, Trost, Cochran, Blankenship,
& Batson, 2003). All these results are consistent with a within-compound
association explanation of potentiation, but there are still problems. Per-
haps most important, animals appear to form within-compound associa-
tions whether potentiation or overshadowing is observed (e.g., Rescorla
& Durlach, 1981). Therefore, although within-compound associations may
sometimes be involved in potentiation, they are not sufficient to produce it.
There was other trouble as well. A number of reports emerged that
failed to produce odor potentiation by taste (e.g., Bouton & Whiting, 1982;
Mikulka, Pitts, & Philput, 1982); in fact, in a series of studies, taste con-
sistently overshadowed the conditioning of odor (e.g., Bouton & Whiting,
1982). Eventually, experiments showed that the odor had to be extremely
nonsalient—very weakly conditionable on its own—for potentiation to
occur. When the odor in the drink was made very weakly conditionable by
presenting it in an even more dilute solution (so dilute that it was 1/400th
the concentration originally used by Rusiniak et al., 1979), odor potentia-
tion by taste occurred (Bouton, Jones, McPhillips, & Swartzentruber, 1986).
It turns out that the potentiated cue’s weak conditionability is far more
important than its status as an odor or a taste. When a saline solution was
similarly diluted so that it was a very weak cue for illness, this taste was
potentiated by saccharin in further experiments (Bouton, Dunlap, & Swart-
zentruber, 1987). Even odors can potentiate tastes (and tastes can potentiate
odors), provided that a strongly conditionable cue is combined with a more
weakly conditionable target (Slotnick, Westbrook, & Darling, 1997). Thus,
what appears to matter is weak conditionability of the target CS.
Urcelay and Miller (2009) accepted this idea and went on to demonstrate
that potentiation can even occur in fear conditioning. In their experiments,
a weakly conditionable soft noise was paired with shock either alone or
in combination with a louder and more salient tone. When the shock was
presented immediately after the CS (in a delay conditioning procedure),
the tone overshadowed conditioning of the noise. But when the shock did
not occur for 10 or 20 seconds after the CS went off (in a trace conditioning
procedure), the tone potentiated conditioning of the noise. Batsell et al.
(2012) found analogous results in flavor aversion experiments: Potentiation
occurred when there was a long interval between the flavors and illness (e.g.,
120 minutes), but overshadowing tended to occur when there was a short
interval (e.g., 0 minutes). They suggested that conditioning of a weak cue is
ordinarily very poor with a long delay interval, but that combining the weak
cue with an additional strong cue creates a salient configural stimulus (see
Chapter 4) that is better learned about over the long interval. Their account
attempted to explain potentiation with general principles of conditioning.
Where does this leave us? One point is that both overshadowing and
potentiation can occur in compound conditioning, and that conclusion is
important because models of conditioning (see Chapter 4) have tended
to focus on competition between cues. On the other hand, potentiation
is not a freak result that is connected with certain combinations of odor
and taste (or context and taste). It appears to happen when a very weakly
conditionable cue is combined with a more salient cue. Other studies also
suggest that a salient stimulus can enhance conditioning of a weaker CS in
autoshaping (Thomas, Robertson, & Lieberman, 1987) and learning about
locations in a maze (e.g., Graham, Good, McGregor, & Pearce, 2006). In-
stead of indicating that aversion learning obeys special laws, compound
potentiation challenges us to come up with better general laws.
Conclusion
By the late 1970s and early 1980s, taste aversion learning had been absorbed
into mainstream learning theory. It is now viewed as another example of
associative learning. There will probably always be differences in the condi-
tions that lead to better eyeblink conditioning, fear conditioning, and taste
aversion learning, all of which are potentially illuminating. For example,
we may discover that the preparations vary in whether they are examples
of S-R or S-S learning. But the differences are not necessarily threats to a
general understanding of learning. Chicken Little (Seligman, 1970) was
wrong; the sky never fell.
Taste aversion learning did bring some very significant changes to the
field, though. Students of learning theory now recognize that evolution
does influence learning; stimulus relevance in particular is now a gener-
ally accepted aspect of conditioning and associative learning. In addition,
there is an interest in the function of conditioning, as exemplified by the
perspective of learning as an adaptation process (discussed in Chapter 2)
and by the current thinking about the processes that influence the form of
the CR (see Chapter 5). Taste aversion learning added a fresh biological
and functional perspective to how we view conditioning. It did not lead
to reinvention of the wheel; it led to a better wheel.

Some Reasons Learning Laws May Be General


There is thus some generality to the laws of learning after all. Yet, if we
accept the fundamental idea that learning processes evolved because the
ability to learn is functional (see Chapter 2), why should this be? An ani-
mal’s ability to sense and receive stimulus input from the external world
is also functional, in a general way. Yet animals are not equipped with a
general sensory system; instead, evolution has given them separate visual,
auditory, and other sensory systems. Why shouldn’t analogous divisions
occur in learning systems? Why should there be a general learning process?
Evolution produces both generality and specificity
The answers to these questions are complex and far from resolved. But it
is important to understand that an evolutionary approach to learning can
accept general learning laws as well as adaptive specializations. When
most of us try to see things from an evolutionary perspective, we tend to
see traits as adaptations—things that were selected directly for the func-
tion they now perform. But evolution does not work in only this way.
Sometimes a functional trait is an exaptation, a characteristic or feature
that was not selected directly for the job it currently performs (e.g., Gould,
1991; Gould & Vrba, 1982; see also Buss, Haselton, Shackelford, Bleske, &
Wakefield, 1998). The general idea is that a trait adapted for solving one
particular problem is sometimes good enough at solving another problem,
too. In this case, the trait would solve two problems, but has been adapted
for only one. One example of an exaptation is bird feathers. Today, feath-
ers function to help birds fly, but they were probably selected for the very
different function of keeping animals warm. Once early feathers were in
place, though, they probably helped the animals move around, too. The
feathers were thus co-opted for flight. The flight function of feathers there-
fore started as an exaptation, not an adaptation.
Sherry and Schacter (1987) noted that the exaptation concept may
help explain why general-purpose learning and memory processes have
evolved. A learning mechanism that was adapted to solving one problem
might work reasonably well for solving another and simply be co-opted.
Sherry and Schacter observed that for true adaptive specializations in learn-
ing and memory to evolve, it is probably necessary that a mechanism for
solving one problem be functionally incompatible with solving others. For
example, a learning mechanism that allows habits to build up incrementally
over trials might be incompatible with a learning mechanism that responds
to the need to remember a once-experienced place or episode. The point
is that functional incompatibility between the two problems might be re-
quired to force the evolution of separate mechanisms. The solution to the
food selection problem that taste aversion learning provides is probably not
functionally incompatible with other problems that classical conditioning
mechanisms solve. As Sherry and Schacter note, “An evolutionary analysis
indicates that both generality and specificity must be expected to occur as
characteristics of memory and learning” (Sherry & Schacter, 1987, p. 449).
There is another reason that learning principles can be general. It is
possible for two learning mechanisms to evolve separately but converge
on similar principles because the problems that they solve are inherently
similar. The bird’s wing and the bee’s wing have many common properties,
even though these appendages evolved independently. In this case, the
traits are said to be analogous rather than homologous (they are similar
but have separate evolutionary origins). Learning principles may thus be
rather similar (analogous) from situation to situation because the problems
that they deal with are similar. For example, it is generally true that events
that occur together in time or space really do go together in the world. Learning has
evolved to help animals understand the causal structure, or causal links,
between things in their world (Dickinson, 1980); there may be generality
to that structure. Of course, a number of other general functions of operant
and classical conditioning were considered in Chapter 2.
Still, Sherry and Schacter (1987) noted at least one place where there might
be a functional incompatibility: incremental habit learning versus memory
for episodes that happen just once. Perhaps there are more incompatible
systems than just these. Human memory researchers have often divided
memory into several different systems, including those that handle knowl-
edge about facts, short-term information, habits, autobiographical episodes,
and more (e.g., Squire, 1987; Tulving, 1972; see Klein, Cosmides, Tooby, &
Chance, 2002, for discussions). It is currently popular to think that the “mind
is like a Swiss army knife, a general purpose tool made of many specialized
parts” (Shettleworth, 1998, p. 567). The specialized parts are sometimes called
modules, or cognitive mechanisms that are designed to handle specific types
of input (e.g., see Shettleworth, 2002). Modules are thought to be relatively
“encapsulated”; that is, they are unaffected by other modules.
Öhman and Mineka (2001) argued that fear learning in humans is con-
trolled by a “fear module” that is selective to its own “prepared” type of
environmental input (we seem prepared to associate snakes and spiders—but
not flowers or mushrooms—with fear-arousing USs). It is also thought to
be encapsulated in the sense that it is not affected by conscious influences.
The module nonetheless operates according to familiar classical condition-
ing laws, with “preparedness” being one of them. Thus, it might not be the
case that different modules that have been hypothesized are truly function-
ally incompatible in the Sherry and Schacter sense. Because of the way the
world is generally organized, perhaps the associative learning rules that are
represented in classical conditioning may generalize across many domains.
The generality of relative validity
It does appear that the laws of conditioning can sometimes have aston-
ishing generality. One surprising example is the relative validity effect
described in Chapter 3 (Wagner et al., 1968), which illustrates the point that
conditioning occurs to the extent that the CS provides information about
the US. In the basic experiment, one group (Group Correlated) received
a mix of AX+ and BX– trials. During later testing, the animals showed
strong learning to A and relatively little to X, the latter of which was less
informative about the US. The other group (Group Uncorrelated) received
very similar trials except that AX and BX were both paired with the US half
the time (the animals received AX+, AX–, BX+, and BX– trials). This group
showed better learning to X, even though it was paired equally often with
the US, because A and B were not more informative. Wagner et al. (1968)
reported the relative validity effect in experiments on fear conditioning in
rats, operant learning in rats, and eyeblink conditioning in rabbits. It was
also later shown in pigeon autoshaping (Wasserman, 1974). The results of
one of the original experiments, which we already saw in Chapter 3, are
summarized again in Figure 6.6A.
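A small simulation makes the logic of the design concrete. The sketch below is constructed purely for illustration (it is not code from Wagner et al., 1968, or any other study cited here); the learning rate, number of trials, and starting values are arbitrary assumptions, and the program simply applies the Rescorla-Wagner rule to the two schedules.

import random

ALPHA_BETA = 0.2    # combined learning-rate parameter (assumed value)
LAMBDA_US = 1.0     # asymptote on reinforced trials
LAMBDA_NO_US = 0.0  # asymptote on nonreinforced trials

def run_group(trial_types, n_blocks=50, seed=0):
    """Present each trial type once per block, in a random order, for n_blocks blocks."""
    rng = random.Random(seed)
    V = {"A": 0.0, "B": 0.0, "X": 0.0}              # associative strengths start at zero
    for _ in range(n_blocks):
        block = list(trial_types)
        rng.shuffle(block)
        for cues, reinforced in block:
            lam = LAMBDA_US if reinforced else LAMBDA_NO_US
            error = lam - sum(V[c] for c in cues)   # one shared error term per trial
            for c in cues:                          # every cue present shares it
                V[c] += ALPHA_BETA * error
    return V

correlated = [(("A", "X"), True), (("B", "X"), False)]        # AX+ and BX- trials
uncorrelated = [(("A", "X"), True), (("A", "X"), False),      # AX+, AX-,
                (("B", "X"), True), (("B", "X"), False)]      # BX+, BX- trials

print("Correlated:  ", run_group(correlated))
print("Uncorrelated:", run_group(uncorrelated))

Although X is reinforced on half of its trials in both groups, it ends up with much less strength in the Correlated group, because A absorbs most of the available strength on the AX+ trials; that is the pattern shown in Figure 6.6.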
What is remarkable about this result is that the relative validity effect
has been demonstrated in an even wider range of species using a variety
of conditioning preparations. For example, it has been shown in several
experiments with human participants (Baker, Mercier, Vallee-Tourangeau,
Frank, & Pan, 1993; Shanks, 1991; Wasserman, 1990). David Shanks brought
human participants into the lab and asked them to pretend that they were
doctors diagnosing diseases. The participants saw a series of medical cases
presented on a computer screen. In each case, a fictitious person was de-
scribed as having one or several symptoms, such as puffy eyes, swollen
glands, or sore arms. The person might also have a fictitious disease, such
[Figure 6.6 appears here: panel (A) Rabbits, responses (%); panel (B) Humans, judgment; panel (C) Honeybees, cumulative frequency across 30-second intervals; each panel compares the Uncorrelated and Correlated conditions.]
Figure 6.6  Results of relative validity experiments in rabbits, humans, and hon-
eybees. In all species and tasks, a cue acquires less associative strength when
it is combined with more valid predictors of the US during learning (the Cor-
related condition) than when it is not. (After Wagner et al., 1968, Shanks, 1991,
and Couvillon et al., 1983, respectively.)

as Dempes disorder or Phipp’s syndrome. The participants were first given
the symptoms and were then required to guess which disease each patient
had. Over trials, they learned that certain symptoms were associated with
certain diseases. In this situation, each medical case is analogous to a con-
ditioning trial—each symptom is a CS and each disease is a US.
Several classic conditioning effects have been demonstrated with this
method. In one experiment, Shanks (1991) repeated the relative validity
conditions used by Wagner et al. (1968). In the Correlated condition, Dem-
pes disorder occurred with symptoms A and X (AX+), but not with symp-
toms B and X (BX–). In the Uncorrelated condition, the disease occurred
half the time with A and X (AX+, AX–) and half the time with B and X (BX+,
BX–). Once again, a potential predictor (X) was paired with a “US” half
the time in both conditions. At the end of the experiment, Shanks asked
the participants, “If you were to see 100 patients, each of which had puffy
eyes, how many would you estimate would have Dempes disorder?” Fig-
ure 6.6B shows the data for the symptom that corresponded to CS X. The
results were like those seen previously in rats, rabbits, and pigeons: People
gave lower ratings to the symptom that was accompanied by more informative
predictors in the Correlated condition.
The results have also been shown in another remarkable animal, the
honeybee (Couvillon, Klosterhalfen, & Bitterman, 1983; Figure 6.6C). Late
in their lives, honeybees leave the hive and forage among flowers for nectar
(which provides energy) and pollen (which provides protein). To forage
efficiently, the honeybee learns to associate the color and odor of each type
of flower with the reward contained within it. Consequently, on a given
foraging trip away from the hive, a honeybee tends to visit only similar
flowers. Associative learning is undoubtedly very important in the behav-
ior of these insects.
The late Jeff Bitterman, Pat Couvillon, and their associates at the Uni-
versity of Hawaii have run many fascinating conditioning experiments
with honeybees; they conduct their experiments in a laboratory that is
open to the outdoors (Figure 6.7). Couvillon et al. (1983) caught individual
honeybees near a hive that the investigators maintained. (The bees were

Figure 6.7  Conditioning with free-flying honeybees. (A) Subjects are captured outdoors near the hive at a station providing sucrose. (B) The experiment is then run on a modified windowsill in the lab. (C) The subject is placed on a petri dish that may have a bright color, an odor, or both; those are CSs. (D) The subject flies between the hive and the CS for repeated conditioning trials. (E) A rewarded trial: The subject feeds on sucrose in the middle of a petri dish. (Photographs courtesy of Pat Couvillon.)
attracted to a feeding station where a sucrose solution could be obtained.)
After capture, each bee was marked with a little nail polish for identifica-
tion and then transported a short distance to a windowsill of the labora-
tory where the experiment was run. A bee given a drop of sucrose at the
windowsill would fly back to the hive and then repeatedly return to the
windowsill for more. Each visit allowed a new conditioning trial. On each
visit, a drop of sucrose was presented on the lid of a petri dish that might
be colored and/or have a scent. Different colors and odors thus provided
CSs; the honeybees readily approached the petri dish lids once the bees
had associated the colors and odors with the sucrose US.
Couvillon et al. (1983) gave two different groups of honeybees the Cor-
related and Uncorrelated treatments. On every trial, bees in both groups
saw an orange disk in the center of the petri dish; the orange-colored disk
played the role of X. On some trials, it was combined with a jasmine odor
(A); on other trials, it was combined with a violet odor (B). As above, half
the trials were paired with the US (sucrose), and half were not (water, in-
stead of sucrose, was present in the dish). The Correlated group received
the AX+ BX– treatment, whereas the Uncorrelated group received AX and
BX with and without sucrose half the time (as usual). In a final test, the
orange disk (X) was presented alone, without either of the odors, for 10
minutes. Couvillon et al. measured the number of times that the honeybees
contacted the orange disk (stronger conditioning leads to more contacts).
As Figure 6.6C shows, the relative validity result was obtained: Bees in
the Correlated condition were less likely to treat Orange (X) as a strong
CS for sucrose.
The honeybee result is especially interesting because the bee’s brain has
evolved separately from the brains of the vertebrate animals that we have
usually considered in this book (e.g., rats, rabbits, and humans). As Bitter-
man once said, the common ancestor of humans and honeybees might have
had one synapse: “hardly any brain at all” (Bitterman, 2000, p. 71). Thus,
it seems truly remarkable that the honeybee reacts to the contingencies of
the relative validity experiment in such a comparable way. Despite such
different brains, the behavioral principles describing learning seem similar;
indeed, there is an almost amazing generality to the relative validity effect.
The similarity of results across species and methods makes it credible to
think that whatever function the laws of conditioning were originally
adapted to serve, they may operate rather widely in the animal kingdom. Let us
therefore look a little deeper at associative learning in honeybees and hu-
mans and ask whether the same rules of learning really do apply.

Associative Learning in Honeybees and Humans


Conditioning in bees
The learning and behavior of foraging honeybees has also been studied
by ethologists—those who study the evolution of animal behavior in the
natural environment (e.g., Gould, 1991; Menzel, Greggers, & Hammer,
1993). Ethologists might not be so surprised if learning processes differed
across species, although they do see important commonalities in learning
(e.g., Gould & Marler, 1987). A better way to characterize the ethologists’
approach is to say that they tend to emphasize that learning fine-tunes
an animal’s natural behavior and built-in predilections. For example, we
might find the honeybee naturally approaching and ready to learn about
color and odor stimuli (which mark flowers) because such cues have reli-
ably predicted the location of nectar over many generations of bees. In ef-
fect, learning might allow bees to know which colors and which odors are
relevant in their search for food. Although this view is sometimes seen as
fundamentally different from that taken by psychological learning theory
(Gould, 1991), I hope you recognize that it is not. This book has accepted
both phylogenetic selection of certain CSs over others (through prepared-
ness or stimulus relevance) as well as ontogenetic selection (created by an
individual’s experience over time).
The question is whether the principles that describe the foraging bee’s
learning about flowers are qualitatively similar to what we have been talk-
ing about here. To a large extent, the research suggests that the answer is
yes. Honeybees clearly do learn about flowers, and the results of a large
number of experiments suggest that their learning shows striking parallels
with the principles discussed in this book (for reviews, see Bitterman, 1988,
1996; Giurfa, 2008; Giurfa & Sandoz, 2012). For example, honeybees have
shown many classic compound conditioning effects—including summa-
tion, overshadowing, sensory preconditioning, second-order conditioning,
negative patterning, positive patterning, and compound potentiation—in
addition to the relative validity effect described above (see Figure 6.6C).
They also show many other phenomena, such as latent inhibition and ex-
tinction—along with reinstatement and spontaneous recovery effects (e.g.,
see Eisenhardt, 2014; Plath, Felsenberg, & Eisenhardt, 2012). It does not
seem unreasonable to think of the honeybee foraging in a world of com-
pound flower CSs that guide its approach behavior following the rules
described in the preceding chapters.
The generality also goes beyond effects that we have discussed so far.
For example, bees appear to have short-term memory: Using methods
similar to ones used to study short-term memory in pigeons (see Chapter
8), Couvillon, Ferreira, and Bitterman (2003) showed that bees can use
their memory of what color was encountered a few seconds before to de-
termine which of two colors to approach to find a reward (see also Giurfa,
Zhang, Jenett, Menzell, & Srinivasan, 2001). Recent experiments suggest
that bees can also learn to approach the odd visual stimulus out of dis-
plays of three stimuli when the stimulus combinations are unique from
trial to trial (Muszynski & Couvillon, 2015). Honeybees also demonstrate
some classic “paradoxical reward effects” that will be covered in Chapter
9 when we consider interactions between learning and motivation. In rats
and other mammals (Bitterman, 1988), the presentation of a big reward
can paradoxically decrease the positive value of a smaller reward when
it is presented later (see Flaherty, 1996, for a review). To put it anthropo-
morphically, getting a small reward after a big one can be disappointing.
For example, when rats have had experience drinking a tasty 32% sucrose
solution, they drink less of a 4% sucrose solution (still sweet, but not as
sweet) than animals that have had 4% all along (e.g., Flaherty, Becker, &
Checke, 1983). It is as if animals become frustrated (Amsel, 1992) when a
reward is smaller than expected. A similar successive negative contrast
effect, along with other related effects, has been observed in honeybees (e.g., Cou-
villon & Bitterman, 1984; Couvillon, Nagrampa, & Bitterman, 1994; Loo
& Bitterman, 1992). Interestingly, however, the goldfish, a vertebrate that
is more closely related to mammals than is the honeybee, does not show
these effects (e.g., Couvillon & Bitterman, 1985). The contrast effect has
probably evolved quite independently in bees and mammals.
Still, it would be a great mistake to think of the honeybee as a simple
little rat or human. Not everything works in bees the way you might expect.
For example, conditioned inhibition produced in the classic conditioned in-
hibition or feature-negative (A+/AX–) procedure has been difficult to show
in honeybees despite several attempts to find it (Couvillon, Ablan, & Bit-
terman, 1999; Couvillon, Ablan, Ferreira, & Bitterman, 2001; see also Cou-
villon, Hsiung, Cooke, & Bitterman, 2005), although an unusual variation
of the procedure produced some positive results (Couvillon, Bumanglag,
& Bitterman, 2003). Inhibition is so important to theories of conditioning
that failure to find it could turn out to be fundamental. It is also possible,
however, that we have not run the right experiment yet. For example, the
blocking effect, also central to our understanding of learning in vertebrates,
was once thought to be absent in bees. Blocking did not seem to occur in
bees when the compound involved both a color and an odor (Funayama,
Couvillon, & Bitterman, 1995), although it did occur if the compound was
made up of two CSs from the same modality, such as odor-odor or color-
color (Couvillon, Arakaki, & Bitterman, 1997; Smith & Cobey, 1994). Later
results suggest that blocking can be obtained with CSs from different mo-
dalities as long as the blocking CS is relatively salient (Couvillon, Campos,
Bass, & Bitterman, 2001). Despite the occasional difference between bee
learning and vertebrate learning, “the differences are far outweighed by
the similarities” (Bitterman, 2000, p. 71).
In the end, although it is best to acknowledge that both similarities
and differences are present in what we know about learning in different
nonhuman animals, there is impressive generality to known conditioning
principles. On the other hand, we must also acknowledge that any one be-
havioral effect (e.g., overshadowing, blocking, relative validity) can result
from several processes. This idea is not new; we saw in earlier chapters
that the blocking effect, even in one species, might be controlled by more
than one psychological process. Perhaps some of these processes play a
larger role in some species or in some conditioning preparations than in
others. This question will stimulate more research that will continue to
broaden our perspective on learning. It nonetheless seems clear that models
of conditioning are an excellent place to start if one wants to study and
understand learning in other species and in other preparations.
Category and causal learning in humans
Conditioning principles have also been important in recent work on asso-
ciative learning in humans (e.g., Shanks, 1995, 2010). There are two situa-
tions in which conditioning theories have been seen as especially relevant.
First, in category learning, humans (or animals; see Chapter 8) learn to
classify items as belonging to one category or another. For example, does
this patient have our fictitious Dempes disorder or the dreaded Phipp’s
syndrome? Is this thing in front of me a cat, a dog, or a bagel? As we saw
in Chapter 1, one approach is to think that all items are made up of many
features and that categorization depends on learning to associate the right
features with the right categories. The learning of feature-category associa-
tions in humans is similar to the learning of CS-US associations in animals.
Second, in causal learning, people learn about the causes of certain
effects. Here again, the idea is that certain cues (causes) become associ-
ated with certain outcomes (effects). For example, some important early
experiments involved human participants playing a video game (Dickin-
son, Shanks, & Evenden, 1984). In this game, military tanks occasionally
rolled across the screen. Pressing the space bar key at the computer ter-
minal caused shells to be fired at these tanks, which might then explode
(Figure 6.8) either because of the shell that was fired (a kind of causal
CS) or because of mines that were hidden in a mine field (another, com-
petitive, causal CS). At the end of many trials, the participant was asked
how effective the shells were in making the tank explode. The method
can be expanded in different ways; for example, there can be other causes
of explosions, such as planes flying overhead (e.g., Baker et al., 1993).
Subsequent associative learning experiments with humans have used
more modern and sophisticated computer-game graphics (e.g., Nelson
& Lamoureux, 2015; Nelson & Sanjuan, 2006). In causal learning tasks,
like categorization tasks, the human participant must sort out the con-
nections between cues and outcomes, much as animals do in classical
conditioning experiments.
Remarkably, experiments with these tasks have uncovered further cor-
respondences, in addition to the relative validity example discussed above,
between how humans and animals learn (e.g., see Shanks, Holyoak, &
Medin, 1996, for a series of papers on the subject; and DeHouwer & Beck-
ers, 2002, for a review). For example, humans show blocking effects. The
participant might first be shown a series of trials in which the on-screen
military tank explodes because of mines; in a second phase, the participant
might now be able to blow up the tank with the shell. The earlier phase
reduces the participant’s belief in the effectiveness of the shell. Humans
also show conditioned inhibition. Here, the participant might be shown
trials with the mines exploding tanks, and when the shell is shot, no ex-
plosion occurs. Participants given such treatments tend to rate the shell’s
Figure 6.8  Shell or mine? Computer display used in experiments on conditioning and causal learning by Dickinson, Shanks, and Evenden (1984). Trials started with just a gunsight on the screen (A). A tank then appeared on the right edge and began moving toward the left. If a shot was fired while the tank was in the sight (B), a “hit” was registered, and the tank then exploded as it passed through an invisible minefield on the left (C, D). The question: What caused the tank to explode? (Courtesy of Anthony Dickinson.)

effectiveness (e.g., as inhibitory or protecting against explosions) exactly as
you would expect from your knowledge of conditioning. The research has
generated considerable interest in whether models of conditioning—espe-
cially the Rescorla-Wagner model—can handle categorization and causal
learning problems. One reason the Rescorla-Wagner model has attracted
particular attention is that the equation is essentially the same as one of the
equations that cognitive psychologists use to change the strength of the con-
nections between units (or nodes) in connectionist networks (often called
the “delta rule”; see Gluck & Bower, 1988; Sutton & Barto, 1981). As shown
Figure 6.9  A connectionist model in which input units (or nodes) are associated with output units. In conditioning experiments, inputs and outputs are CSs and USs, respectively. In category learning (as shown), they are features and categories. In causal learning, they might be causes and effects. (After Shanks, 1991.)

in Chapter 1, connectionist models can describe categorization. Although
the Rescorla-Wagner model does the job reasonably well, if people learn
to associate causes and effects (or features and categories) in the ways that
animals associate CSs and USs, the other models of classical conditioning
that we have examined probably also apply.
To illustrate, let us consider how David Shanks (1991) proposed to
handle his relative-validity result in the symptom-disease situation. He
proposed a simple connectionist network model (illustrated in Figure
6.9) that is quite similar to the connectionist model that we considered in
Chapter 1 (McClelland & Rumelhart, 1985). Here the nodes or units on the
left correspond to features (i.e., symptoms like puffy eyes, swollen glands,
and sore arms), and the nodes on the right correspond to categories (i.e.,
Dempes disorder, Phipp’s syndrome, etc.). Essentially, all the network does
is learn to associate the features with the categories following the Rescorla-
Wagner rule. On each trial, the nodes corresponding to the features that are
present are activated, as are the nodes corresponding to the categories that
are present. When features and categories occur together on a trial, the links
between them are increased a bit according to the Rescorla-Wagner equa-
tion. When a predicted category does not occur, the connection between
the feature and the category is weakened according to the same rule (it is
an extinction trial). Then—after learning has occurred—when the feature
units are activated by themselves, the corresponding category units are also
activated via the new connections, and in this way, the network “predicts”
that, say, puffy eyes and swollen glands mean Dempes disorder. The simi-
larity to modern conditioning models should be clear.
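Here is a minimal sketch of that mechanism in code. It is an illustration of the logic just described, not Shanks's actual program, and the feature names, disease names, learning rate, and number of cases are assumptions chosen for the example.

RATE = 0.3  # assumed learning-rate parameter (alpha times beta)
FEATURES = ["puffy eyes", "swollen glands", "sore arms"]
CATEGORIES = ["Dempes disorder", "Phipp's syndrome"]

# one connection weight for every feature-category link, all starting at zero
W = {(f, c): 0.0 for f in FEATURES for c in CATEGORIES}

def train_case(present_features, diagnosis):
    """One 'medical case': apply the delta (Rescorla-Wagner) rule at every category node."""
    for c in CATEGORIES:
        target = 1.0 if c == diagnosis else 0.0                 # lambda for this output node
        prediction = sum(W[(f, c)] for f in present_features)   # summed activation it receives
        error = target - prediction
        for f in present_features:                              # present features share the error
            W[(f, c)] += RATE * error

def activation(present_features, category):
    """After learning, activating feature nodes activates category nodes via the weights."""
    return sum(W[(f, category)] for f in present_features)

# thirty cases in which puffy eyes and swollen glands accompany Dempes disorder
for _ in range(30):
    train_case(["puffy eyes", "swollen glands"], "Dempes disorder")

print(activation(["puffy eyes"], "Dempes disorder"))  # about 0.5; the two features share the strength
print(activation(["puffy eyes"], "Phipp's syndrome")) # remains 0.0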
Because this kind of network is essentially a new presentation of the Re-
scorla-Wagner model, it predicts the same things that the Rescorla-Wagner
model predicts. Two features will compete with each other to gain connec-
tion with a category; the network will produce blocking effects, overex-
pectation, overshadowing, and so forth. The network will learn inhibition,
too. If we intermix A+ and AB– trials, the connection between B and the US
takes on a negative value (e.g., Chapman, 1991; Chapman & Robbins, 1990;
see also Williams, 1995). Of course, the Rescorla-Wagner model’s weak-
nesses are also built into the network. For example, extinction will destroy
or weaken the connection between a feature and a category. This charac-
teristic of many network models is known as “catastrophic interference”;
that is, training the network on something new in Phase 2 catastrophically
destroys what was learned in Phase 1. In Chapter 5, we discussed various
solutions to this problem that have been proposed for extinction in classi-
cal conditioning; perhaps they will also be brought to bear here someday.
This way of thinking about category learning in humans allows us to
see research on animal learning in a new light. Conditioning experiments
are designed to strip complex situations down to their essential details. For
a conditioning theorist, even the simple model proposed by Shanks (1991)
might be broken down into two simpler kinds of nets, which are super-
imposed (Figure 6.10). In one, multiple cues are connected with a single
outcome. As you might recognize, this kind of network, in which several
different cues can come to predict a US, is the kind of network that we have
routinely talked about. In a second kind of network, a single cue can be
connected with more than one output. In this case, we are concerned with
a CS that may be paired with more than one kind of US. As you probably
recognize, this sort of network has not been studied nearly as extensively.
Thinking about network models of categorization in this way might reveal
a gap in our understanding.

Figure 6.10  The model in Figure 6.9 actually contains even simpler networks in which several inputs are associated with a common output and a common input is associated with several outputs.
Figure 6.11  A connectionist model with hidden units between the inputs and outputs. (After Rumelhart, Hinton, & McClelland, 1986.)

It should be noted that the models shown in Figures 6.9 and 6.10 barely
scratch the surface of ideas about connectionist modeling (e.g., Rumelhart,
Hinton, & Williams, 1986). Many other kinds of models have been pro-
posed. For example, networks are sometimes expanded to include a layer
of hidden units, additional units that come between input and output
(e.g., Rumelhart, et al., 1986; Figure 6.11). Connections between these new
units and the inputs and outputs can be learned by using a modification
of the delta rule (the Rescorla-Wagner equation) known as the generalized
delta rule (Rumelhart et al., 1986). Models with hidden units have some
advantages, the main one being that they can learn problems in which the
combinations of inputs are informative, rather than single inputs alone.
We talked about configural conditioning in previous chapters; for example,
recall that animals and humans can learn negative patterning, in which
A+, B+, and AB– trials are intermixed. In general, we can learn that either
A or B alone signals the US or category, whereas A and B together in a
compound signals nothing. Negative patterning cannot be solved by the
Figure 6.9 network because A and B’s separate connections with the US
would always summate and activate the US node (because it is an elemental
model). Conceptually, the network with hidden units solves the problem
in the following way. A and B would both separately be associated with
the US, but a combination of the two would activate a hidden unit that
would acquire an inhibitory connection with the US. This inhibition would
cancel the activation provided by A and B. The main alternative for fixing
the simpler Figure 6.9 network would be to provide a novel “configural”
input for when A and B are combined (see Figure 4.17 and the surrounding
discussion in Chapter 4). This approach would need to supply a unique
configural unit for every one of the billions of CS combinations that are out
there in the world (a very large potential number of input units that strikes
many theorists as awkward and implausible). An advantage of a network
with hidden units is that it spontaneously finds an existing hidden unit to
do the job; in principle, the same unit can do the same job for many other
CS combinations. It is not necessary to invent a new stimulus for every
new combination of stimuli.
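A worked example with made-up weights may help. Suppose learning leaves the direct connections at \( V_A = V_B = 1 \) and gives the hidden unit \( H \), which is activated only when A and B occur together, a connection of \( V_H = -2 \). The network's predictions are then

\[
A: \; V_A = 1, \qquad B: \; V_B = 1, \qquad AB: \; V_A + V_B + V_H = 1 + 1 - 2 = 0,
\]

which is exactly the A+, B+, AB– pattern. Without the hidden unit, the best a purely elemental network can do for the compound is \( V_A + V_B \), which can never be smaller than the response to either element alone.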
Connectionist models like the one just described have largely been
developed in the area of human cognition. Their similarity to the animal
conditioning models we have already discussed suggests, however, that
the basic associative learning principles uncovered in experiments on con-
ditioning may have interesting applicability and generality. Connectionist
modeling principles have also given conditioning theorists some interest-
ing new tools to work with as they attempt to understand basic condi-
tioning and learning phenomena (e.g., Delamater, 2012; Gluck & Myers,
1993; Kehoe, 1988; Kutlu & Schmajuk, 2012; McLaren, Kaye, & Mackintosh,
1989; McLaren & Mackintosh, 2000, 2002; O’Reilly & Rudy, 2001; Schmajuk,
Lamoureux, & Holland, 1998).
Some disconnections between conditioning and human category
and causal learning
Although there are many correspondences between conditioning and
human learning, we have not quite covered the whole story. Indeed, there
are findings in human category and causal learning that seem inconsis-
tent with classical conditioning in animals. For example, when humans
are exposed to a series of learning trials, they may sometimes extract a
rule that describes the trial outcomes instead of merely learning simple
associations. Shanks and Darby (1998) gave humans a task in which dif-
ferent hypothetical foods were associated with an allergic reaction. Over
a mixed series of trials, the participants received both positive patterning
and negative patterning discriminations (e.g., A–, B–, AB+, C+, D+, CD–).
To solve the discriminations, the participants could associate the foods
(and their configurations) with the presence or absence of the allergic reac-
tion, as associative learning models would assume, or they could simply
learn the unstated rule that “a compound and its elements always predict
opposite outcomes.” To test this possibility, Shanks and Darby also gave
the participants E+, F+, and GH– trials intermixed with the ones just de-
scribed. What would happen when the participants were then tested with
EF and G and H for the first time? Although they had not been trained with
these stimuli, many of the participants rated the combined EF as a weaker
predictor than E or F and rated G and H as stronger predictors than the
combined GH; that is, they learned and applied the rule that “compounds
and elements are opposite.” This sort of result is not predicted by an asso-
ciative theory. Other results also suggest that humans may use inferential
reasoning in associative learning tasks (e.g., Beckers, DeHouwer, Pineño, &
Miller, 2005; Lovibond, Been, Mitchell, Bouton, & Frohardt, 2003; see also
Figure 6.12  In forward blocking, A+ precedes AB+. In backward blocking, A+ follows AB+. In either case, A+ training can reduce judgments of B's causal effectiveness.

                      Phase 1     Phase 2     Test
Forward blocking      A+          AB+         B?
Backward blocking     AB+         A+          B?

Beckers, Miller, DeHouwer, & Urushihara, 2006, vs. Haselgrove et al., 2010,
for discussions regarding the idea that rats might use similar processes).
Thus, humans may do more than simply associate features with outcomes.
One of the most well-studied effects that seems unique to humans is a
phenomenon known as backward blocking (e.g., Chapman, 1991; Dick-
inson et al., 1984; Shanks, 1985; Williams, Sagness, & McPhee, 1994). The
usual blocking procedure involves the sequence of A+ trials and then AB+
trials (Figure 6.12). As you must know by heart now, animals and humans
do not learn much about B. The backward procedure reverses the order,
so AB+ occurs first, followed by A+. The Rescorla-Wagner model does not
predict blocking here. When B is paired with the US on the first trials, A has
no associative strength, and there is no way for it to block the conditioning
of B. Nonetheless, backward blocking does occur in humans (Figure 6.13).
Contrary to animal conditioning models, humans underrate the predictive
value of B when A+ trials either precede or follow the AB+ trials.
Backward blocking is not usually obtained in rats and pigeons, although
it does occur when no biologically significant USs are used in the two
crucial phases (Miller & Matute, 1996). (Human experiments differ from
the typical animal experiment in that they do not involve biologically sig-
nificant USs.) But when backward blocking does occur, how are we to ex-
plain it? There are two general approaches. One way is to expand existing
conditioning models so that they can handle the phenomenon. The other
approach is to assume that humans learn about cues and outcomes in a

Figure 6.13  Backward blocking (see Figure 6.12) reduces the effectiveness rating (mean judgment of effectiveness) of Cue B. In the control condition, the participant had no Phase 2 experience with Cue A. These data were collected with the tank and minefield method illustrated in Figure 6.8. (After Shanks, 1985.)
fundamentally different way. Of course, this fundamentally different way
might also then apply to nonhuman animals as well as it does to humans.
Maybe everything we know is wrong.
Here is how the new story might go. In any blocking experiment, we
might learn everything about A and B and then perform some mental
computation at the time of the test that allows us to know that B is a less
probable cause of the outcome than A. One influential approach is the
probabilistic contrast model developed by Patricia Cheng (e.g., Cheng
& Holyoak, 1995; Cheng & Novick, 1992; see also Cheng, 1997). The basic
idea is that we can learn and remember something like the probability of
an outcome given various cues. We might then judge the contingency be-
tween a cue and its outcome by comparing (or contrasting) the probability
of the outcome when the cue is present with its probability when the cue
is absent. If the outcome is more probable when the cue is present than
when the cue is absent, there is a positive contingency; when the outcome
is less probable when the cue is present, there is a negative contingency.
Animals and humans are clearly sensitive to such contingencies (e.g., Re-
scorla, 1967b; Wasserman, Elek, Chatlosh, & Baker, 1993). The language
might sound familiar because it is exactly how CS/US contingency was
presented in Chapter 3. The probabilistic contrast theory accepts this de-
scription of contingency as the way the organism actually computes and
understands it. In contrast, conditioning models like the Rescorla-Wagner
model work at a different level. Participants are not expected to store the
probabilities and calculate a contrast at the test. Instead, there are incre-
ments and decrements to associations with all the cues present as experi-
ence accumulates over trials.
To explain effects like blocking, probabilistic contrast theory needs to
take a further step. This is because in either forward or backward blocking,
there is a contingency between B and the outcome; the outcome is more
probable in the presence of B than in its absence! So, why does blocking to
B occur? The answer is that the contingency between a cue and an outcome
is actually further calculated within certain focal sets. If you look at the
experimental designs in Figure 6.12, you recognize immediately that B is
a poor predictor of the outcome because, given the presence of A already, it
predicts no further change in the probability of the outcome. The presence
of A is the focal set; within it, the contingency between B and the outcome
is zero. Thus, by emphasizing the role of focal sets, the approach explains
both forward and backward blocking, and it can be extended to handle
many other effects as well. In every case, the model assumes at the time of
testing that we choose the appropriate focal sets, contrast the appropriate
probabilities within them, and then make judgments based on the various
contrasts that we perform. (There can be many different focal sets and many
different contrasts in more complex experiments.) One problem has been
to predict exactly what focal sets will be used in each new situation that
comes along. Later versions of the theory attempted to make this prediction
less ambiguous (e.g., Cheng, 1997).
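To restate the two key computations as formulas (my summary of the verbal descriptions above), the basic contrast is

\[
\Delta P = P(\mathrm{outcome} \mid \mathrm{cue\ present}) - P(\mathrm{outcome} \mid \mathrm{cue\ absent}),
\]

with positive values indicating a positive contingency and negative values a negative one. Within the focal set of trials on which A is present, the contrast for B in either design of Figure 6.12 is

\[
\Delta P_{B \mid A} = P(\mathrm{outcome} \mid AB) - P(\mathrm{outcome} \mid A\ \mathrm{alone}) = 1 - 1 = 0,
\]

so B is judged ineffective whether the A+ trials come before or after the AB+ trials.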
One interesting feature of probabilistic contrast theory is that it ignores
the possible effects of trial order. The model predicts that blocking should
be equally strong in the backward and forward cases; either way, the contin-
gency between B and the outcome given A is zero. Unfortunately, that is not
quite right. Backward blocking is not as potent as forward blocking (Chap-
man, 1991; see also Dickinson & Burke, 1996). Other trial order effects also
clearly occur in human associative learning (e.g., Chapman, 1991; Shanks,
Lopez, Darby, & Dickinson, 1996). Probabilistic contrast theory ignores trial
sequence, which often matters, as conditioning models usually expect.
What, then, should we do? Conditioning models can also handle back-
ward blocking if they are expanded a bit. Van Hamme and Wasserman
(1994) suggested that one way to think of backward blocking is that the
subject is actually learning about the absent cue (B) when the other cue is
being treated alone in the second phase (A+). They showed that people do
change their ratings of the predictive values of cues that are absent. This
may occur because A and B were associated in the first compound stage
(Dickinson & Burke, 1996). Thus, every time A occurs, it retrieves a repre-
sentation of B, thereby allowing the trial outcome to influence B.
The trick is then to figure out why the associative strength of the absent
cue (B) is decremented on the very trials on which the present cue (A) is paired
with the outcome and is itself incremented. Van Hamme and Wasserman (1994) and Dickin-
son and Burke (1996) suggested different ways that this might be achieved.
Dickinson and Burke suggested an extension of Wagner’s SOP theory. Re-
call that, when a CS retrieves an associated stimulus, it puts the stimulus
into a secondarily active state (A2). Therefore, when A occurs in Phase 2
of the backward blocking experiment, the associated cue (B) is being put
into A2. In contrast, when the outcome is actually presented on these trials,
its own node is being put into the focally active A1 state. Dickinson and
Burke suggested that when two nodes are in different states at the same
time, an inhibitory link may be established between them; when they are
in the same state (either A1 or A2), an excitatory link may be established.
(This suggestion is consistent with the spirit of SOP, although the original
model only considered cases in which the CS was in A1.) The idea works:
Backward blocking will occur because B is in A2 at the same time that the
US is in A1. The resulting inhibition will weaken any connection learned
between B and the US during the preceding compound-conditioning trials.
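One simple way to implement the idea that an absent-but-retrieved cue is learned about is to update it with a negative learning rate, which is essentially the kind of change Van Hamme and Wasserman (1994) proposed. The sketch below illustrates that general idea rather than reproducing their published model, and all of the parameter values are arbitrary assumptions.

RATE_PRESENT = 0.3    # learning rate for cues presented on the trial (assumed)
RATE_ABSENT = -0.15   # negative rate for cues that are retrieved but absent (assumed)
LAMBDA = 1.0          # asymptote on reinforced trials

V = {"A": 0.0, "B": 0.0}

def trial(present, retrieved_absent, reinforced):
    lam = LAMBDA if reinforced else 0.0
    error = lam - sum(V[c] for c in present)  # error is based on the cues actually present
    for c in present:
        V[c] += RATE_PRESENT * error
    for c in retrieved_absent:                # e.g., B retrieved by A through the
        V[c] += RATE_ABSENT * error           # within-compound association

for _ in range(20):                           # Phase 1: AB+ compound trials
    trial(["A", "B"], [], reinforced=True)
print("After AB+ trials:", V)

for _ in range(20):                           # Phase 2: A+ trials; A retrieves the absent B
    trial(["A"], ["B"], reinforced=True)
print("After A+ trials: ", V)                 # A keeps gaining strength while B loses some:
                                              # backward blocking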
If your head is spinning a little now, that is okay, because the heads
of many experts are spinning, too. But these ideas about how condition-
ing models can be expanded to explain backward blocking are leading to
new discoveries. Dwyer, Mackintosh, and Boakes (1998) reasoned that the
expanded models imply that two completely absent cues can become as-
sociated if their associates are paired together. That is, if a CS and a US were
both retrieved into A2 at the same time, we should expect a new association
to form between them. In an experiment to test this prediction, rats re-
ceived a peppermint flavor in one context (Context 1), and almond-sucrose
pairings in another. The rats presumably learned a Context 1–peppermint
association as well as an almond-sucrose association. In the next phase,
Context 1 and almond were paired. Remarkably, this treatment caused the
rats to associate the two absent cues (peppermint and sucrose), as indicated
by an increase in the rat’s preference for the peppermint flavor alone. (Re-
member that rats like flavors associated with sucrose.) This finding seems
plausible and reasonable enough, and in a way it is surprising that it was
not discovered until 1998. Its discovery, though, was a direct result of the
new theoretical ideas that were developed to account for phenomena like
backward blocking.
Causes, effects, and causal power
Although conditioning theories have surprising scope, there may be limita-
tions to what they can do. When humans learn about causes and effects,
they appreciate the causal relation in a way that seems to go beyond a
simple connection. When you observe an elephant crashing into a fence,
you understand that the elephant caused the fence to collapse. The elephant
and the collapsed fence are not merely associated; there is an understood
“causal power” that is transmitted from the cause to the effect. The notion
of causal power cannot be represented very easily in associative networks
(e.g., Cheng, 1997; Holyoak & Cheng, 2011; Waldmann, 1996).
The idea has testable implications. Waldmann and Holyoak (1992) ar-
gued that our understanding of how causes operate can influence causal
judgments. For example, humans enter experiments already knowing that
two potential causes of an effect can compete with each other. For example,
if you notice that your grass is greener today than yesterday, you may
wonder whether it is because it rained yesterday or because someone ran
the sprinkler. These two causes provide conflicting explanations of the ef-
fect. They can compete, just as cues do in blocking, overshadowing, and
relative validity. On the other hand, two effects may not compete for their
connections with a cause. Rain may cause your grass to be both greener
and wetter. Being green and being wet do not compete for an association
with rain. (In fact, they tend to support each other: If the grass is green,
you might reasonably infer that it is also moist.)
Waldmann and Holyoak (1992; see also Waldmann, 2000) ran several
experiments in which people received associative learning trials with dif-
ferent cover stories. For example, different groups were asked to learn
about a set of buttons that could be illuminated to turn on an alarm (e.g.,
cause an alarm to go on) or merely indicate that the alarm was on (i.e., were
instead caused by the alarm). Both groups then received the same series of
trials in a blocking design: An initial set of trials established one button-
alarm association, and then a redundant button came on along with the
first and was also paired with the alarm. When the participants thought
that the buttons activated the alarm, there was competition between but-
tons, and blocking occurred, but when the participants thought that the
Figure 6.14  Ratings of stimuli A and B when they were in a forward blocking design (A+ trials followed by AB+ trials). When participants understood that A and B were causes of another event (+), blocking occurred; there was competition between causes. In contrast, when participants understood that A and B were effects of another event (+), blocking did not occur; there was no competition between effects. (After Waldmann & Holyoak, 1992.)

buttons merely indicated that the alarm was on, there was no blocking
(Figure 6.14). Thus, given identical learning trials, whether blocking oc-
curred or not depended on whether the lights were viewed as causes or
effects. Causes compete, but effects do not.
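
To see concretely why an associative analysis predicts cue competition in this design, here is a minimal sketch of forward blocking using the Rescorla-Wagner learning rule. The learning rate, asymptote, and trial counts are arbitrary values chosen for illustration; they are not taken from Waldmann and Holyoak's experiments.

# A minimal Rescorla-Wagner sketch of forward blocking (A+ trials, then AB+ trials).
# Parameter values here are arbitrary illustrations, not fits to any experiment.

def rw_update(V, cues, lam, alpha=0.3):
    """Update the associative strengths in V for the cues present on one trial."""
    error = lam - sum(V[c] for c in cues)   # prediction error shared by the present cues
    for c in cues:
        V[c] += alpha * error
    return V

V = {"A": 0.0, "B": 0.0}

for _ in range(20):                         # Phase 1: A alone is paired with the outcome (A+)
    rw_update(V, ["A"], lam=1.0)

for _ in range(20):                         # Phase 2: A and B together are paired with the outcome (AB+)
    rw_update(V, ["A", "B"], lam=1.0)

print(V)                                    # A ends near 1.0; B stays near 0

Because A already predicts the outcome by the time the AB+ trials begin, the prediction error on those trials is close to zero and B gains very little strength, which is the blocking result shown by the "causes" group.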
Waldmann and Holyoak’s (1992) interpretation has not gone unchal-
lenged (e.g., see Matute, Arcediano, & Miller, 1996), and several later experi-
ments produced competition (blocking or overshadowing) between “effects”
as well as “causes” (e.g., Arcediano, Matute, Escobar, & Miller, 2005; Baker,
Murphy, Mehta, & Baetu, 2005; Cobos, López, Caño, Almaraz, & Shanks,
2002; Tangen & Allan, 2003). This kind of result favors associative models.
Yet we all understand the difference between events that predict events ver-
sus events that cause other events. For example, smoke can predict fire, but
does smoke cause fire? The analysis of causal power is sophisticated, and it
can go some distance in explaining conditioning-like effects in humans (e.g.,
Cheng, 1997; but see, e.g., Baker et al., 2005). Does it handle the data better
than conditioning models? Associative models may have the upper hand in
accounting for trial order effects (but see Holyoak & Cheng, 2011); they also
sometimes provide a better fit of the data (e.g., Wasserman et al., 1993; see
also Shanks, 2010). On the other hand, it is difficult to see how a conditioning
model explains how a human can show blocking by just looking at a table
printed on a page (such as the one in Figure 6.12), and then predicting which
cue produced the outcome (cf. Wasserman, 1990).
The idea that our knowledge of cause and effect can influence how
we learn is part of a broader perspective that humans use propositions to
connect stimuli and outcomes in conditioning-like experiments. Over the
last several years, several authors (e.g., De Houwer, 2009, 2015; Mitchell,
De Houwer, & Lovibond, 2009) have argued that associative learning in
humans (and possibly other animals) entails proposition learning; we
learn about propositional relations—descriptions of the world that can
be true or false (like “A causes B”)—instead of learning simple excitatory
or inhibitory associations between A and B. On this view, a human given
pairings of a bell and food is thought to learn, for example, “When I hear
the bell, I will receive food,” instead of merely associating the bell and food
(Mitchell et al., 2009). (Pavlov’s dog is also thought to learn some nonverbal
form of this.) Associative learning via proposition learning is supposed
to be effortful and attention-demanding. It is also thought to depend on
reasoning and awareness. Advocates of the idea (De Houwer, 2009, 2015;
Mitchell et al., 2009) have suggested that it can replace the view that as-
sociative learning involves merely associating mental representations.
One of the simplest results suggesting a role for propositions is that
conditioned responses can be created by merely giving a human participant
verbal information. For example, Cook and Harris (1937) long ago demon-
strated that telling humans that “the tone will be followed by shock” was
enough to get the participants to show an electrodermal response (sweat-
ing of the palms) to the tone (see also Lovibond, 2003; Sevenster, Beckers,
& Kindt, 2012). (Skin conductance may be especially sensitive to this kind
of manipulation.) The fact that people can acquire knowledge this way, or
through merely studying the table in Figure 6.12, does not mean that they
never learn associations, however. In fact, most students of human asso-
ciative learning accept what is known as a dual-process view: Humans
can, of course, use rules or propositions to link associated items, but they
also often learn to link them through the conditioning of excitatory and
inhibitory associations (e.g., McLaren et al., 2014).
The idea that propositional and conditioning processes may be separate
is elegantly illustrated by a phenomenon known as the Perruchet effect
(Perruchet, 1985, 2015; see also McAndrew, Jones, McLaren, & McLaren,
2012; Weidemann, Tangen, Lovibond, & Mitchell, 2009). In the original
experiment (Perruchet, 1985), Pierre Perruchet had humans participate
in an eyeblink conditioning experiment in which a brief tone was paired
with a puff of air to the eye on half the trials. Perruchet told his partici-
pants ahead of time that the tone and puff would be paired on a random
50% of the trials. Then, while the participants underwent conditioning,
he monitored conditioned responding to the tone (eyeblinks) as well as
the participants’ expectancy that a puff would occur on the next trial. The
random series of CS-puff and CS-alone trials yielded runs of reinforced and
nonreinforced trials of different lengths. As you would expect from what
you know about conditioning, eyeblinks elicited by the tone tracked the
tone’s trial-to-trial associative strength: The CR increased as the number
of immediately preceding CS-puff trials increased, and it decreased as the
number of preceding CS-alone (extinction) trials increased (Figure 6.15A).
Remarkably, however, expectancy ratings went the opposite direction: The
expectancy of a puff on the next trial decreased as the number of preceding
(A) Conditioned eyeblinks (%)   (B) Expectancy ratings (scale 0–7); both plotted against the length and type of the preceding run (CS alone vs. CS-US pairs).
Figure 6.15  The Perruchet effect. In a conditioning experiment, participants were told that half the CS presentations would be paired with an air puff and half would not. The random 50% schedule yielded runs of CS-alone and CS-US pairings of different lengths. (A) Conditioned eyeblink responses increased as the number of previous CS-US pairings increased (and decreased as the number of CS-alone trials increased). (B) In contrast, the participants’ expectancy that an air puff would occur on the next trial showed exactly the opposite pattern, because the ratings were vulnerable to the “gambler’s fallacy.” (After Perruchet, 1985, 2015.)

CS-puff pairings increased, and it increased as the number of preceding
CS-alone trials increased (see Figure 6.15B). Expectancy ratings were ap-
parently vulnerable to the so-called gambler’s fallacy: Given a 50/50 chance
of winning, people reason (fallaciously!) that after a run of reinforcers, no
reinforcer is bound to occur next time, and after a run of no reinforcers, a
reinforcer is bound to occur next time. The fact that the CR and expectancy
ratings went in opposite directions suggests that one cannot explain the
other. Thus, expectancies generated by “reasoning” cannot account for
the strength of the eyeblink response and vice versa. We are left with the
conclusion that a conditioning process that controlled the eyeblink CR and
a (fallible) propositional process that controlled verbal expectancy ratings
were both operating, separately, in Perruchet’s experiment.
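
One way to picture the dissociation is to run the two hypothesized processes side by side on the same random 50% sequence: an associative strength that is nudged up by each CS-puff pairing and down by each CS-alone trial, and a "gambler's-fallacy" expectancy that moves opposite to the current run. The update rules and numbers in the sketch below are arbitrary illustrations, not a model fitted to Perruchet's data.

import random

# A toy illustration of the two processes thought to operate in the Perruchet design.
# All update rules and parameter values are arbitrary choices for the demonstration.

random.seed(1)
trials = [random.random() < 0.5 for _ in range(40)]   # True = CS-puff pairing, False = CS alone

V = 0.5     # hypothetical associative strength driving the eyeblink CR
run = 0     # signed run length: +n after n consecutive pairings, -n after n CS-alone trials

for paired in trials:
    # Conditioning-like process: strength is pushed up by a pairing, down by a CS-alone trial
    V += 0.2 * ((1.0 if paired else 0.0) - V)

    # Length of the run that ends with the current trial
    run = (run + 1 if run > 0 else 1) if paired else (run - 1 if run < 0 else -1)

    # "Gambler's fallacy": the longer the run of pairings, the LESS a puff is expected
    # on the next trial (and the reverse after a run of CS-alone trials)
    expectancy = max(0.0, min(1.0, 0.5 - 0.1 * run))

    print(f"paired={str(paired):5}  CR strength={V:.2f}  expected puff next={expectancy:.2f}")

Printed trial by trial, the two columns move in opposite directions after long runs, which is the pattern summarized in Figure 6.15.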
Psychologists have a tendency to assume that only one approach to a
problem can be correct and that the best model will win and destroy all
the other ones. In fact, as Baker, Murphy, and Vallee-Tourangeau (1996)
argued, many mechanisms may easily contribute to a particular behavioral
result. Thus, both associative and proposition learning can contribute to
behavior studied in category and causal reasoning experiments in people.
This conclusion is actually the same one we reached when we were dis-
cussing honeybees. Many processes contribute to one effect—whether in
humans, rats, or insects—but nonetheless it can definitely be said that as-
sociative learning models based on conditioning principles apply surpris-
ingly widely. The attempt to apply them to new problems has strengthened
and expanded their scope.
Conclusion
This chapter has covered many trees, although the forest has been consis-
tent throughout. There have been good reasons to ask whether the learning
principles discovered in the animal learning lab really generalize to the
world outside. At this point, the answer is yes, they do apply. Condition-
ing principles have helped us understand a wide variety of phenomena.
In addition to the ones considered in this chapter, I hope you have not
forgotten the many useful applications of conditioning principles to drug
abuse, anxiety disorders, and so forth that have been in the background
of our discussion all along. Another point is that raising and investigating
the question of generality has strengthened the theories by spurring ex-
perimenters to do research that has expanded and improved them. We saw
this in taste aversion learning; more recently, we saw it in the application
of conditioning models to human associative learning. There is a kind of
ratcheting effect of studying the basics—and then seeing how far they can
go—that puts conditioning processes in better perspective and is good for
everyone. There will probably always be some limits or differences, but
even different classical conditioning preparations, like eyeblink condition-
ing, fear conditioning, and taste aversion learning, have things that make
them different from one another. My preference is to see the differences as
things that fine-tune a general associative process to their different func-
tions. Appreciation of causal power may be a new example.
Where does this issue go from here? Perhaps it will always be in the
background. Research investigating correspondences between condition-
ing and associative learning in humans is particularly active right now. In
Chapter 8, we will see the principles in action again as we try to work out
how organisms learn the spatial layout of the environment. There are signs
that space may be the next (although, undoubtedly, not the final) frontier.
This chapter has actually focused on an issue raised in Chapter 1, where
we saw a fundamental difference between how philosophers think that
knowledge and the mind are supposed to work. David Hume and the
associationists argued that associations were everything; that we never
know causality directly, but that we infer it from sense impressions; and
that the mind is a blank slate—a tabula rasa—at first. Immanuel Kant,
in contrast, argued that the mind has a priori assumptions with which it
molds experience. This chapter has considered several Kantian themes. The
animal’s apparent inborn tendency to associate certain kinds of events in
taste aversion experiments is one of them; the cue-to-consequence effect is
a major argument against the classic tabula rasa. Causal power (whether
we know it directly or infer it from association) is another Kantian idea.
The tension between empiricistic and rationalistic views will undoubtedly
motivate research in the psychology of learning for many years to come.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources and online quizzes.

Summary
1. By the early 1970s, research had uncovered several surprising proper-
ties of taste aversion learning. Writers began to question the generality
of the laws of learning that had been discovered up to that point.
2. Many of the unusual properties of aversion learning were eventually also
found in other examples of classical conditioning. These properties began
to be explained by an expanding set of general learning principles.
3. Research on taste aversion learning produced many insights into the gener-
al aspects of learning. We now understand that learning serves a biological
function and that not all cues are equally associated with various conse-
quences. Function and evolution have a powerful influence on learning.
4. General learning processes may evolve because of exaptation: the
concept that a learning mechanism adapted for solving one particular
problem may work well enough at handling other problems. True adaptive
specialization may require that a mechanism adapted to handle one prob-
lem be functionally incompatible with the solution of another problem.
5. At a descriptive level, conditioning principles often seem surprisingly gen-
eral across species and conditioning preparations. For example, the relative
validity effect—originally demonstrated in rats, rabbits, and pigeons—has
now been demonstrated in human categorization and causal judgment. It
has also been shown in classical conditioning in honeybees. The honeybee
result is interesting because the bee’s brain evolved independently of the
brains of rats, rabbits, pigeons, and humans.
6. There are many correspondences between conditioning in honeybees
and vertebrates, but there are also possible differences. Behavioral out-
comes that look similar at a descriptive level may sometimes result from
different processes.
7. There are also many correspondences between classical conditioning in
animals and categorization and causal learning in humans. Categorization
and causal learning may obey conditioning principles like the ones in the
Rescorla-Wagner model. Another idea is that they involve other processes,
such as probabilistic contrast and the perception of causal power.
8. Research on associative learning in humans has uncovered new ef-
fects. One is backward blocking. Although backward blocking was not
predicted by models of classical conditioning, those models are being
extended to account for it.
9. Some investigators have argued that conditioning involves learning
propositions about the relationships between events, like CSs and USs,
in the world. The Perruchet effect, however, suggests that processes
that control conditioned responses and verbal reports can be dissoci-
ated and can operate separately.
10. In the long run, research that has examined the scope and generality of
learning laws developed in the animal learning laboratory has helped us
develop better general laws.

Discussion Questions
1. Why was taste aversion learning thought to be such a special form of
learning? List the phenomena that suggested that it might follow unusual
learning laws. Then use research on them to explain why many scientists
now think that research on taste aversion learning mainly led to better
general learning laws instead of the discovery of new specialized laws.
2. Use your knowledge of concepts like adaptation, exaptation, and func-
tional incompatibility to outline conditions that might lead to the evolu-
tion of specialized versus general learning mechanisms.
3. Use concepts that you know from classical conditioning to explain how
honeybees learn to approach flowers with different colors and odors
when they forage for nectar. What evidence supports the idea that con-
ditioning in honeybees follows rules like the ones we know from experi-
ments with rats, pigeons, and humans? Explain why learning in the hon-
eybee is so interesting to compare with learning in these other species.
4. How can we use classical conditioning principles to understand cat-
egory learning and causal learning in humans? What is the CS, and what
is the US? What are the strengths and weaknesses of this approach? Are
principles of classical conditioning equipped to explain everything that
humans learn or infer when they participate in category learning, causal
learning, and classical conditioning experiments?

Key Terms
analogous  221
backward blocking  234
category learning  228
causal learning  228
compound potentiation  217
dual-process view  239
exaptation  220
focal sets  235
hedonic shift  213
hidden units  232
homologous  221
interference  210
long-delay learning  206
modules  222
Perruchet effect  239
probabilistic contrast model  235
proposition learning  239
stimulus relevance  210
successive negative contrast  227
taste-reactivity test  213
within-compound association  218
Chapter Outline
Basic Tools and Issues  246
Reinforcement versus contiguity theory  246
Flexibility, purpose, and motivation  249
Operant psychology  252
Conditioned reinforcement  254
The Relationship Between Behavior and Payoff  257
Different ways to schedule payoff  257
Choice  260
Choice is everywhere  264
Impulsiveness and self-control  266
Nudging better choices  271
Behavioral economics: Are reinforcers all alike?  272
Theories of Reinforcement  276
Drive reduction  276
The Premack principle  277
Problems with the Premack principle  280
Behavioral regulation theory  282
Selection by consequences  284
Summary  288
Discussion Questions  289
Key Terms  291
Chapter 7
Behavior and Its Consequences

This chapter turns an important corner in our book. We have just concluded a rather extended discussion
of Pavlovian conditioning (S-O learning), which is im-
portant both as a phenomenon in its own right and
as a powerful method for investigating how we learn
about stimuli in our environment. We now begin a
discussion of what I called “response learning” in the
earliest chapters: how we learn about behaviors and
their consequences (R-O learning). This type of learn-
ing is widely known as instrumental learning or oper-
ant conditioning, and the classic example (of course)
is the rat pressing a lever in the Skinner box. To the
uninitiated, the rat in the Skinner box seems as trivial
as Pavlov’s original drooling dog, but once again we
are dealing with something that is important both as
a general phenomenon and as a method. Without re-
sponse learning, we would be pitifully poor at adapt-
ing our behavior to the environment. The Skinner box
actually provides a method for studying how this kind
of learning operates. As we saw in Chapter 1, operant
behavior is lawfully related to its consequences, and
understanding the relationship between behavior and
its consequences is a big part of what the study of
response learning is all about.
Operant conditioning is usually discussed as if it
were completely independent of classical condition-
ing, and we will stick with that tradition in this chapter.
There is something important from Chapters 1 and 2
that you should remember right from the start, how-
ever: Response learning and stimulus learning almost always go hand in
hand. That is, whenever a behavior and a reinforcer are paired, there are
always cues in the background that are also associated with the reinforcer.
(Similarly, whenever a CS leads to a US, there are also behaviors that may
be associated with the US.) This chapter will consider operant learning on
its own, but the interconnectedness of S-O and R-O learning will become
an issue in the chapters that follow. Chapter 8 will expand on the idea that
operant behavior always occurs in the presence of stimuli that guide and
set the occasion for it. Some of these stimuli (like categories and time and
space) introduce a rather interesting cognitive dimension to the discus-
sion. Chapter 9 will then consider the idea that reinforcers motivate oper-
ant behavior instead of merely stamping it in. S-O learning turns out to
be critically involved in how reinforcers motivate. Finally, in Chapter 10,
we will describe the modern “synthetic view” of response learning that
combines behavioral and cognitive approaches, S-O and R-O learning,
and also ethological ideas and themes. This discussion will provide an
opportunity to summarize and integrate some of the book’s main ideas.
In the meantime, the best place to start a discussion of response learning
is by considering some of the early attempts to understand it. The early
thinkers identified many of the basic issues, and they also developed many
of the major methods and conceptual tools. What is most fascinating about
the early thinkers, though, is how remarkably different their ideas were.
From the very start, there were many ways to think about response learn-
ing; to be honest, there still are.

Basic Tools and Issues


Reinforcement versus contiguity theory
We discussed Edward Thorndike in Chapter 1. Thorndike’s ideas had a
huge effect on all the thinking that followed. He emphasized
the importance of reinforcement in instrumental learning. You remember
that Thorndike’s early experiments were concerned with cats learning to
manipulate latches to get out of puzzle boxes and get food. Thorndike
ran the experiments because he was interested in understanding the cat’s
intelligence, and he came away with the idea that the cat merely associated
the stimulus and response in a gradual way. All that seemed to be required
to explain the cat’s learning was the idea that the reward strengthened an
S-R association between the situation and the correct response. Thorndike
eventually saw this as a very general principle of learning.
Thorndike’s theory of learning was a reinforcement theory. By this we
mean that he emphasized the reinforcing consequences of behavior, which
he thought were necessary for learning to occur. He thought that reinforcers
created a satisfying state of affairs that “stamped in” the S-R association. He
gave us the law of effect—the idea that positive consequences of behavior
strengthen the behavior, whereas negative consequences weaken it. Thanks
in part to theorists who built on Thorndike’s ideas (most notably Clark Hull
and B. F. Skinner), the law of effect is so widely known and influential today
that most psychology students assume that no other emphasis is possible.
In fact, however, there were other views. Edwin R. Guthrie (1935), another
early theorist, had a radically different idea. Guthrie did not believe that
reinforcement was necessary for learning. Instead, he believed that learning
occurred whenever an S and an R merely occurred together in time. Simple
temporal contiguity was all that was required to learn an S-R association.
Because of the centrality of this idea, Guthrie’s theory is known as a contigu-
ity theory (as opposed to a reinforcement theory). Another unusual aspect
of Guthrie’s theory was that he assumed that learning took only one trial.
Today, Guthrie’s contiguity approach strikes many people as rather
strange. How could he possibly ignore the obvious importance of rein-
forcement? It would be very easy to show that a rat that is rewarded for
running in a runway would run faster than a rat that receives no reward.
And how could Guthrie deny the obvious gradualness of most learning
curves? He did not deny these facts; instead, he offered a different way of
explaining them. Consider the gradualness of the typical learning curve.
Although he assumed that an S-R association was learned with just one
trial, he defined the stimulus differently. He argued that any “stimulus,”
like a runway or a puzzle box, is actually made up of a very large number
of stimulus elements. It is stimulus elements that get connected with the
response on any trial. Note, though, that the rat cannot possibly notice all
the elements on a single trial; therefore, on Trial 1, he only connects the
response to the elements that he has actually noticed or “sampled” (to use
a more modern term). On Trial 2, the rat notices some of those elements,
but not all, and he also notices some new (yet-to-be-connected) elements.
The response is only a little bit bigger than the Trial 1 response. After Trial
2, however, the new elements are connected, and so on. After many trials,
the rat is more likely to notice a large number of elements that have been
connected. The general idea, illustrated in Figure 7.1, explains why the
learning curve is so gradual.

Trial 1 Trial 2 Trial 3. . . . . . Trial 16

Figure 7.1  Learning as Guthrie saw it. On each trial, a small set of stimulus
elements (in the circle) are sampled and get connected with the response
(blue dots). As trials continue, more and more elements are associated
with the response. These associated elements are more and more likely to be
sampled on subsequent conditioning trials.
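
Guthrie's account lends itself to a simple simulation. In the sketch below, every sampled element is connected to the response in a single trial, yet the measured response grows gradually because only the already-connected elements that happen to be sampled on a given trial contribute to it. The element count, sample size, and number of trials are arbitrary choices for illustration, not values Guthrie proposed.

import random

# Illustrative simulation of Guthrie's stimulus-sampling account of the gradual learning curve.

random.seed(0)
elements = list(range(100))   # the many stimulus elements that make up "the situation"
connected = set()             # elements already connected to the response (one-trial learning)

for trial in range(1, 17):
    sampled = set(random.sample(elements, 20))            # elements noticed on this trial
    strength = len(sampled & connected) / len(sampled)    # share of noticed elements already connected
    connected |= sampled                                   # every sampled element is connected in one trial
    print(f"Trial {trial:2d}: response strength = {strength:.2f}")

Even though each individual connection is formed all at once, the printed "response strength" rises gradually across trials, just as Guthrie argued the learning curve should.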
Guthrie’s idea of stimulus elements is still around in psychological
theory. In the 1950s and 1960s, William K. Estes developed a sophisticated
mathematical theory (stimulus sampling theory) that extended the idea
(e.g., Estes, 1950, 1955, 1959). The idea is now widely incorporated in mod-
els of human memory (e.g., Healy, Kosslyn, & Shiffrin, 1992; Mensink &
Raaijmakers, 1988) and in recent accounts of discrimination learning (see
Chapter 8). Connectionist views like the ones we considered in previous
chapters, with their emphasis on the stimuli composed of many to-be-
associated features, also accept a stimulus-elemental approach.
But what about the other problem? How did Guthrie explain the obvious
effect of reinforcers on behavior? Here, his thinking was again subtle and
original. Guthrie said that a reinforcer did not stamp in the S-R connection.
Instead, it is merely a very salient stimulus, so its presentation completely
changes the situation. Thus, on any learning trial, presentation of the rein-
forcer will change the background stimulation so much that the response
that was just associated with the stimulus, like running in a runway, will be
preserved. (In an animal that receives no reward, a new random response
that might occur after running in the runway, like grooming or standing,
could become associated with the runway instead.) By Guthrie’s account,
rewarded rats respond faster than nonrewarded rats because competing
responses would get associated with a changed, or different, situation.
Guthrie’s account of the effects of reward did not stimulate a large
amount of research. However, Guthrie and Horton (1946) did publish some
interesting results that followed directly on Thorndike’s cat experiments.
The results showed the plausibility of the contiguity approach even in
Thorndike’s puzzle box situation. Cats in Guthrie and Horton’s experi-
ment had to learn to move a vertical pole to get out of a puzzle box. They
did so in a manner that was very stereotyped from trial to trial. Many cats
learned to move the pole by rubbing up against it with their head, arched
back, and then tail. This flank-rubbing response occurred repeatedly on
trial after trial, and if it ever changed, it changed suddenly (and not gradu-
ally) to a different behavior. The fixity of the response from trial to trial
was consistent with Guthrie’s theory. Unfortunately, though, the rubbing
response turns out to be a natural response any cat performs in courtship
or “greeting” rituals. Guthrie and Horton (along with as many as eight
other people) watched their cats while sitting in plain sight of the cat in
the experimental room. It turns out the cats were merely rubbing the pole
as a way of greeting the observers. Moore and Stuttard (1979) repeated the
experiment and found that the flank rubbing occurred when an experi-
menter was in the room but not when the experimenter was absent. Thus,
the stereotyped response was not really produced by one-trial S-R learning
but was a natural behavior elicited by stimuli present in the situation. We
will return to this sort of theme when we examine instrumental learning
from an ethological perspective in Chapter 10.
Some of Guthrie’s ideas seem quaint by modern standards. On the other
hand, many of his ideas—in particular, stimulus elements—have endured.
The point here is not necessarily to convince you to accept Guthrie’s theory;
rather, it is to point out that there have always been alternate ways of view-
ing even the simplest reinforcement situation. In effect, there is more than
one way to skin Thorndike’s cat.
Flexibility, purpose, and motivation
Another view of instrumental learning was promoted by Edward Tolman,
the theorist whose innovation of intervening variables (and “operational
behaviorism”) was discussed in Chapter 1. Like Guthrie, Tolman was a
little bit out of the mainstream. (In retrospect, this mainly means that his
work did not descend directly from Thorndike’s.) His ideas were highly
influential, however, and are perhaps more influential than ever today.
Tolman’s main point was that it was not necessary to be mechanistic
about behavior to be scientific about it. Instead of believing that animals
learn rigid associations between a stimulus and a response, he argued that
behavior is inherently flexible. It is a means to an end—a variable way to
get to a goal. Tolman was skilled at demonstrating his ideas with simple
experiments run in collaboration with his students. For example, in a clas-
sic study by Macfarlane (1930), rats learned to swim in a maze to receive a
reward. After they learned the problem of which way to swim to reach the
reward, Macfarlane drained the water from the maze and then returned
the rats to the start location. What did the rats do? They ran, rather than
swam, to the goal location. Behavior is fundamentally goal-oriented; it is
a flexible means of achieving a goal.
If behavior is so flexible, what exactly is learned? Tolman (1948) sug-
gested that the rats learned something like a map (a “cognitive map”) of
the environment. This idea will be considered further in Chapter 8. For
now, the main point is that learning may not be primarily about behavior.
Tolman thought that animals learn about stimuli (something like S-S as-
sociations) rather than connect a stimulus with a response.
Once again, Tolman managed to illustrate his points with clever experi-
ments. One of the most famous was run by Tolman, Ritchie, and Kalish
(1946a), who ran rats on an “elevated” maze (Figure 7.2A), meaning that
it was off the floor with an open view of the surrounding room. The rats
began at the start location, ran across a tabletop, and then, after several
turns, ran down a straightaway to find food at the goal. After the rats
learned this setup, Tolman et al. (1946a) removed the goal and replaced the
maze with a fan of arms (Figure 7.2B). When tested in this setup, the rats
mostly ran down arm number 6—that is, directly to where the goal had
been during training. Thus, behavior was again flexible and goal-oriented.
But whether the rats had truly learned a mental map of the original maze is
not at all clear. A lightbulb happened to be hanging near the goal location;
during training, the rats could have associated the food with a particular
location near the light and mainly gone in that direction during the test.
The same authors published another experiment during the same year
(Tolman, Ritchie, & Kalish, 1946b). In this one, they ran two groups of rats
(A) Train   (B) Test
Figure 7.2  Bird’s-eye view of the apparatus used by Tolman, Ritchie, and
Kalish (1946a). (A) Training. The rat had to cross a table (the circle) and then
make a left turn and two right turns to get to the food that was located at the
goal location (G). “H” shows the position of a light. (B) Testing with the goal and
original runways removed. Most rats chose the arm that led to the original goal
location. (After Tolman, 1948.)

on the elevated “plus maze” illustrated in Figure 7.3. There were two start locations (S1 and S2), two goal locations (G1 and G2), and two groups of rats. Both groups started from S1 on half the trials and from S2 on the other half. One group was reinforced for performing a consistent response: Regardless of where the rats started from, turning right (for example) led to food. Notice that this meant going to a different place from trial to trial. The second group was reinforced for going to a consistent place: Regardless of the start point, G1 (for example) contained food. Notice that this group had to use a different response (turning left sometimes and turning right other times) to get to the goal. Which group learned better? The answer was clear: The rats that ran to a consistent place learned better. This result suggests that it is easier for rats to learn about places than about responses.

Figure 7.3  Maze used by Tolman, Ritchie, and Kalish (1946b) in their experiment on place and response learning. There were two start locations (S1 and S2) and two goal locations (G1 and G2).

A final experiment by Tolman and Honzik (1930) summarizes Tolman’s perspective very
(A) Maze diagram (start box, curtain, door, food box).   (B) Average errors plotted over trials for groups R, NR, and NR-R.

Figure 7.4  The latent learning experiment (Tolman & Honzik, 1930). (A) The
14-unit T-maze. (B) The results, which suggested that reinforcement is not
necessary for learning, although it was necessary to motivate the animal to
translate what it had learned into performance. The arrow indicates the trial
when the NR-R rats began to be rewarded. (After Tolman & Honzik, 1930.)

well. They used a complicated T-maze with 14 choice points (Figure 7.4A);
a turn at one choice point led to another choice point, and so forth. One
group of rats (R) was rewarded consistently when they reached the final
goal location. As the figure shows, the number of errors these rats made
(the blind alleys they entered) decreased slowly over trials. Another group
(NR) was never rewarded. The experimental group (NR-R) received the
same exposures to the maze, but they were not rewarded on the first 10
trials. (On nonrewarded trials, the experimenter simply removed the rat
from the maze when it reached the goal location.) However, on Trial 11
and on each trial thereafter, these rats received the same reward as the first
group. As Figure 7.4B shows, these rats switched and got through the maze
rather accurately after the first reward. Their change in performance over
trials was much quicker than the group that had been rewarded all along.
Loosely speaking, they had been learning about the maze all along, and
rewarding them beginning on Trial 11 gave them a reason to get through
the maze more efficiently.
The experiment suggests several potentially important conclusions
about instrumental learning, which summarize Tolman’s perspective very
nicely. First, to switch so abruptly after the first reward, the animals must
have been learning about the maze on the early nonreinforced trials; there-
fore, learning had been occurring without reinforcement. Second, whatever
learning was occurring was not manifest in behavior until after a reward
was provided. Learning is not the same as performance; the experiment
demonstrates the so-called learning/performance distinction. (In fact,
the learning was said to be “latent,” and this experiment is often known
as the latent learning experiment.) Third, although the reinforcer was
not necessary for learning, it clearly had a powerful effect on behavior.
But instead of stamping in the response, it motivated the rat by giving it
a reason to get through the maze. Rewards are essential for translating
learning—or knowledge—into performance. This very important idea was
quickly adopted by other theorists and experimenters, including Clark Hull
and Kenneth Spence. That part of Tolman’s story will be told in Chapter 9,
which focuses on the motivational aspects of learning.
It is instructive to see that so many different perspectives on instrumen-
tal learning are possible, and it is worth pausing to summarize the views
of this set of early theorists. Thorndike and Guthrie viewed learning as the
acquisition of S-R associations, whereas Tolman viewed learning as some-
thing more like the acquisition of S-S associations. Thorndike assumed that
reinforcement was necessary for learning, whereas Guthrie and Tolman did
not. Both Guthrie and Tolman nonetheless gave rewards a role: For Guthrie,
rewards changed the situation; for Tolman, rewards provided a crucial
source of motivation. Rewards can thus have at least three possible func-
tions: They might reinforce (Thorndike), they might function more or less
as just another stimulus (Guthrie), and they might motivate performance
(Tolman). It is worth keeping this in mind as we continue our discussion.
Operant psychology
Another person with an early perspective on instrumental learning was B.
F. Skinner. Skinner’s early writings were published around the same time
as Guthrie’s and Tolman’s (in the 1930s), and that is why Skinner can be
considered an early theorist. Because he was a younger man and he lived
longer, his influence actually became strongest in the 1950s, 1960s, and
beyond. As we saw in Chapter 1, Skinner’s approach was different in that
it was deliberately “atheoretical”; he actually never set out to explain in-
strumental learning in the same way that Thorndike, Guthrie, and Tolman
did. His methods and style of analysis have nonetheless been extremely
influential, and they help identify several crucial concepts and tools for
understanding response learning. Most of the research reviewed in the rest
of this chapter was stimulated by Skinner’s approach.
In Skinner’s original operant experiment (see Chapter 1), the rat was
allowed to press a lever repeatedly in a Skinner box to earn food pellets.
Beginning in the 1940s, pigeons were similarly allowed to peck illuminated
discs or “keys” on the wall to earn grain. You will remember that, in either
case, the response is considered an operant because it is a behavior that is
controlled by its consequences (the reinforcer). Conversely, the food pellet
is a reinforcer because it is a consequence that controls an increase in the
operant behavior that produces it. One of the great insights of Skinner’s
method and analysis is the recurrent nature of the operant response. The
animal is able to make the response as often as it wants to; in this sense, it
is different from responding in the puzzle box or the maze. The behavior
thus appears voluntary, and the method allows us to investigate how so-
called voluntary behavior is related to payoff.
Operants are often learned through shaping, a process we considered
in Chapter 2. In shaping, experimenters reinforce closer and closer approxi-
mations of the desired response. Skinner actually thought that the effects of
reinforcers are so powerful that they can be quite accidental and automatic.
In a classic experiment (Skinner, 1948), he put pigeons in Skinner boxes and
gave them food at brief and regular intervals, regardless of what they were
doing. Even though there was no causal relation between the birds’ behav-
ior and getting food, each bird came to behave as if a causal relation was
involved. One bird learned to turn counterclockwise repeatedly, another bird
learned to thrust its head into a corner of the box, and other birds learned to
rock their heads back and forth. Skinner called these behaviors superstitious
behaviors. Each presumably developed because of accidental pairings with
a reward. For example, at one point in time, a bird might happen to rock
a little just before a reinforcer occurred. The response would be repeated,
making it more likely that it would happen before the next reinforcer, and
so on. Superstitious behavior is fairly common in human experience. Many
bowlers twist and contort their bodies after they let go of the ball, as if they
can steer the rolling ball toward making a strike. Baseball batters knock
their helmets, cross their hearts, bang the bat on their cleats, etc., as they get
ready before every pitch. Presumably, these behaviors were also accidentally
associated with a reinforcer, like getting a strike or spare in bowling or mak-
ing a hit or getting a home run in baseball. Superstitious behavior suggests
that reinforcers can have surprisingly arbitrary power. We will consider the
topic again in Chapter 10.
In truth, operant behavior is even more interesting and complex than this. For one thing, Skinner was aware that operants are not emitted in a vacuum; they occur in the presence of stimuli that set the occasion for them. He illustrated this concept in some of his earliest work with rats pressing levers (e.g., Skinner, 1938). For example, Skinner arranged things so that a rat was reinforced for lever pressing whenever a light was turned on. During periods when the light was turned off, however, lever presses were not reinforced. Not surprisingly, the rat detected these different contingencies and began to confine its lever pressing to periods when the light was on (Figure 7.5). This result is a simple example of discrimination learning. In Skinner’s terms, the operant response was brought under stimulus control: Its occurrence now depended on a stimulus that set the occasion for it.

Figure 7.5  Simple demonstration of stimulus control (responses plotted over trials during light-on and light-off periods). If a lever-press response yields a reinforcer when a light is on but not when the light is off, the organism learns to respond in the presence of the light. The light is a discriminative stimulus (or SD) that sets the occasion for the response.
Stimulus control is an extremely fundamental concept, and we will consider a number of interesting forms of it in Chapter 8. For now, just note that
stimulus control is operating almost everywhere. For example, the behaviors
reinforced at a fraternity party are very different from the ones reinforced in
a classroom, and most of us learn to behave accordingly. Children probably
also learn to name things in a similar way. Saying “apple” is reinforced in the
presence of an apple, but “banana” is reinforced in the presence of a banana.
Naming an object can thus be seen as an operant behavior that is under
stimulus control. Stimulus control is truly happening all the time.
With hard-to-discriminate stimuli, the acquisition of stimulus control
can be facilitated by a procedure known as fading. The idea is to gradually
transfer stimulus control from easy stimuli to harder stimuli by presenting
both together and then fading the easy stimuli out. For example, pigeons
have trouble learning to discriminate vertical and horizontal lines projected
on the key, but it is easy for them to learn to discriminate between the colors
red and green. To train a horizontal-vertical discrimination, pigeons can first
be taught to peck at the color red but not green. Then, vertical and horizon-
tal lines can be superimposed on the red and green colors, which can then
be removed gradually (e.g., Terrace, 1963), leaving only the horizontal and
vertical lines visible. There are many real-world examples of this concept.
For example, the director of a play can encourage an actor to speak his or
her lines at appropriate times by first prompting the actor and then gradu-
ally fading out the prompts. In this case, other aspects of the situation (e.g.,
another actor’s lines) begin to set the occasion for the first actor’s response.
Ultimately, stimulus control illustrates another way in which operant
behavior is nicely adjusted to its environment. According to Skinner, the
stimuli that initiate the response are not thought to elicit it; if they did, the response would be a respondent rather than an operant. Instead, the stimuli “set the occasion”
for the response, and at a theoretical level, this occasion-setting role is
similar to the Pavlovian occasion-setting function discussed in Chapter 5.
For now, the important point is that the operant is controlled by both the
stimulus that sets the occasion for it and by the reinforcer that reinforces
it. A stimulus that works in this fashion—by setting the occasion for, rather
than eliciting a response—is known as a discriminative stimulus, or SD.
The stimulus that sets the occasion for not responding (the light-off situa-
tion in the rat experiment) is called an SΔ, or “S delta.”
Conditioned reinforcement
We can take the simple stimulus control experiment a step further. Suppose
that we have a rat that is responding at a good rate in SD but very little in
SΔ. We can now introduce a new contingency: If the rat responds during
SΔ, we can turn SD on and let the animal respond for reinforcers again. This
new contingency (if lever press, turn the light back on) does something
rather interesting. The rat begins responding more during SΔ—that is, the
rate of responding increases when we make the light a consequence of the
response. In this sense, the SD satisfies Skinner’s definition of a reinforcer.
Unlike the food, which has intrinsic reinforcing properties, SD’s reinforcing
properties are conditional on being in our experiment. It is therefore called
a conditioned reinforcer (or secondary reinforcer). Stimuli or events
that reinforce because of their intrinsic properties—like food, water, and
sex—are considered primary reinforcers.
Conditioned reinforcers acquire their value from their Pavlovian rela-
tionship with the primary reinforcer (e.g., Williams, 1994a; see also Fantino,
1969, 1977). The concept of conditioned reinforcement is a very important
one in learning theory. For one thing, it is often considered crucial for gen-
eralizing principles developed in laboratory research with food rewards to
the real world of human behavior. Only a small part of human behavior is
directly reinforced by a primary reinforcer like food, water, or sex. Instead,
it is often controlled by conditioned reinforcers that operate because of their
association with a primary reward. Money is the most famous example.
People will do many things (emit many behaviors) to get money, but it has
no value except that it signals or enables other things. Therefore, money is
a classic example of a conditioned reinforcer.
Conditioned reinforcement is useful in understanding behavior in many
settings. It is often involved even when we are studying primary rewards.
When students shape a rat to press a lever in a Skinner box, they first give
the rat some experience with the delivery of food pellets. While the rat is
wandering around the box, the students deliver food pellets from time
to time. Typically, the food pellets are delivered by an electrical gadget (a
“feeder”) that makes an audible “click” when it operates. Not surprisingly,
the rat quickly learns to approach the food cup at the sound of the “click”
to retrieve the food. That is, the “click” becomes an SD—it sets the occa-
sion for approaching the food cup. It also becomes a Pavlovian CS for food
(these functions are probably learned together). The student next reinforces
the rat for successive approximations of lever pressing. As the animal is
reinforced for moving near the lever, say, the most immediate consequence
of the response is not the food pellet itself, but the click. In this manner, the
student winds up exploiting the click’s power as a conditioned reinforcer.
Similar conditioned reinforcers are used in animal training outside the
lab (e.g., see Pryor & Ramirez, 2014, for some history and overview). A
well-known example is “clicker training,” commonly used with dogs, in
which the sound of a toy clicker is first paired with a food treat and then
used to reinforce behavior. Recall that a delayed reinforcer is typically
not as effective as an immediate one (see Chapter 2). One important role
of conditioned reinforcers is to facilitate learning when there is a delay
between the response and the opportunity to deliver a primary reinforcer.
The conditioned reinforcement concept also helps us understand behav-
ior at a more fine-grained level of analysis. For example, the rat in the Skin-
ner box is doing more than merely pressing the lever. After some training,
it reliably performs a whole sequence of behaviors, or a behavior chain.
The rat might approach the lever, press it, move to the food cup, consume
the pellet, and then approach the lever and repeat the sequence again. The
Stimulus: Sight of lever Feel lever “Click” Food pellet

Response: Approach lever Lever press Approach cup Turn

Figure 7.6  A behavior chain. Each response in the chain produces a stimulus
that serves as a discriminative stimulus that (1) reinforces the previous response
and (2) sets the occasion for the next response.

behaviors in the chain are glued together by discriminative stimuli, which
are present at each step (Figure 7.6). These discriminative stimuli both re-
inforce the preceding behavior as well as set the occasion for the behavior
that follows. Pressing the lever brings the “click,” which reinforces it and
then sets the occasion for approaching the food cup. Similarly, approaching
the lever is reinforced by contact with the lever, which then sets the occasion
for pressing it, and so on. Behavior chains are extremely common in the
real world. For example, a smoker must buy a pack of cigarettes, open the
pack, extract a cigarette, and light it before he or she can inhale the smoke
and be reinforced by nicotine. Similar chains are involved in other forms of
drug taking and overeating (e.g., Olmstead, Lafond, Everitt, & Dickinson,
2001; Thrailkill & Bouton, 2015). In all cases, the dual function of the SD—to
set the occasion for the next response and reinforce the preceding one—is
thought to be the factor that binds complex sequences of behavior together.
Although conditioned reinforcement seems crucial in a detailed under-
standing of behavior, the concept has been somewhat controversial (e.g.,
Bolles, 1967; Rachlin, 1976; Staddon, 1983). One reason is that there is very
little evidence that conditioned reinforcers cause the learning of brand-new
behaviors when these reinforcers are presented without any primary rein-
forcement. Instead, researchers have usually emphasized that conditioned
reinforcers boost performance that is otherwise maintained at a low level
by primary reinforcement. For example, I already mentioned that condi-
tioned reinforcers help boost responding when primary reinforcement is
delayed (e.g., Spence, 1947; Williams, 1994a). In another type of experiment,
Kelleher (1966) had pigeons peck a key to obtain food that was available
only once every 60 minutes. Food was accompanied by a brief stimulus (a
0.7-second presentation of a white key). Not surprisingly, the rather sparse
delivery of food did not cause the pigeons to respond very much. But when
the pigeons also received the brief white key stimulus (without food) for
the first peck emitted at 2-minute intervals, they started responding more.
Thus, presenting the stimulus associated with food enhanced the low level
of responding that was otherwise observed. After reviewing a large number
of related studies, Williams (1994a) concluded that stimuli associated with
primary reinforcers can, in fact, strengthen behavior because these stimuli
acquire their own reinforcing value. An alternative idea is that they provide
a kind of feedback that tells the organism it is getting closer to primary
reward (e.g., Bolles, 1975; Shahan, 2010).

The Relationship Between Behavior and Payoff


Different ways to schedule payoff
The concept of conditioned reinforcement comes up in part when one con-
siders behavior outside the laboratory. Most operant responses in the real
world are not reinforced every time they occur, yet the behavior keeps
on going anyway. This fact was amply appreciated by Skinner and the
behavior analysts who followed him, who have therefore investigated the
effects of scheduling rewards intermittently. Research on schedules of
reinforcement has led to several fundamental insights about how operant
behavior relates to its payoff.
When a behavior is reinforced every time it occurs, we say that it is
reinforced on a continuous reinforcement (CRF) schedule. Not surpris-
ingly, this method of scheduling reinforcers generates a steady rate of
responding. Skinner developed a simple and elegant way to study the
effects of reinforcers on operant behavior. He seemed to love gadgets,
and in his early work, he devised a clever way of portraying the rate
of responding in a graphic manner (see Skinner, 1956, for the interesting
story of how his love of gadgets led him to many of his methods). He
mounted a piece of paper on a rotating drum so that it moved at a slow
and steady rate (Figure 7.7A). A pen sat on the paper, tracing a line as

(A) Cumulative recorder   (B) Cumulative record (cumulative lever presses plotted against time)

Figure 7.7  (A) Cumulative recorder. As the animal responds in the operant cham-
ber, each response deflects a pen slightly upward on a piece of paper mounted
to a drum that is moving at a constant slow rate. (B) The result is a graph that
plots the cumulative number of responses over time, an elegant way to visualize
the rate of an individual’s operant behavior.
the paper moved. Every time the rat pressed the lever, the pen was de-
flected upward a bit. At the end of 30 minutes or so, what could be seen
on the paper was a trace of what the rat had done, in which the cumula-
tive number of lever presses was shown on the y-axis over time (x-axis).
The graph in Figure 7.7B is called a cumulative record, and the drum
recorder is called a cumulative recorder.
Figure 7.7B shows the cumulative record of a rat that performed for
several minutes. One of the virtues of the cumulative record is that the rate
of behavior is readily appreciated by the slope of the line. It is immediately
apparent that the rat pressed the lever at a steady rate.
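
Because a cumulative record is nothing more than a running count of responses plotted against time, one can be built from a simple list of response times. The times in the sketch below are invented purely for illustration.

# Building a cumulative record from a list of response times (in seconds).
# The times are made up for the example.

response_times = [2.1, 5.4, 7.0, 9.8, 12.2, 15.5, 17.9, 21.0, 24.3, 27.1]

# Each response steps the cumulative count up by one at its time of occurrence,
# which is exactly what the pen on a cumulative recorder does.
cumulative = [(t, count) for count, t in enumerate(response_times, start=1)]

for t, count in cumulative:
    print(f"t = {t:5.1f} s   cumulative responses = {count}")

# The slope of the record (responses per unit time) is the response rate;
# a steeper record means faster responding.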
A little thought reveals that reinforcers can be scheduled intermittently
in at least two fundamental ways. First, a reinforcer can be delivered after
a certain number of responses. Because this sort of schedule ensures a ratio
between work (responding) and reward (reinforcement), it is called a ratio
schedule. In a fixed ratio (FR) schedule, reward is scheduled after every
Xth response. On an FR 2 schedule, an animal is reinforced every second
response, whereas on an FR 150 schedule, it is reinforced after every 150th
response. As reinforcers become fewer and farther between (the schedule is
said to become “leaner”), animals begin pausing after each reinforcer. These
“postreinforcer pauses” are evident in the cumulative record shown in the
upper left cell of Figure 7.8. (The downward diagonal deflections of the pen
in the graph indicate reinforcer deliveries.) Interestingly, it is not difficult to
get a fair amount of behavior out of animals on these schedules. For example,
under conditions that will be described in Chapter 9, rats learn to respond on
an FR 5120 schedule fairly easily (Collier, Hirsch, & Hamlin, 1972).
Another variation on the ratio idea is the variable ratio (VR) schedule. In this case, as before, there is always a certain relationship between the number of responses and the number of reinforcers, but now the number of responses required to earn each individual reinforcer varies. On a VR 4 schedule, the average number of responses required to earn a reinforcer is 4, but the first pellet might be delivered after 2 responses, the second after 6, and so forth. Variable ratio schedules can generate very high and steady rates of behavior. (Overall, there is less post-reinforcement pausing.) The world’s greatest example of a VR schedule is the slot machine. These devices deliver coins on a VR schedule, and the consequences are obvious: a steady stream of lever-pulling and income for the casino. A cumulative record from a VR schedule is illustrated in the upper right cell in Figure 7.8.

Figure 7.8  Cumulative records showing typical responding on different schedules of reinforcement (panels: fixed ratio, variable ratio, fixed interval, variable interval; each plots responses cumulated over time). (After Williams, 1988.)
The second, and perhaps less obvious, way of scheduling intermittent
rewards is with interval schedules. In this case, reinforcers are delivered
for the first response made after a certain interval of time has elapsed. On
a fixed interval (FI) schedule, that interval remains constant from one
reinforcer to the next. For example, on an FI 1-minute schedule, the first
response emitted after 1 minute has elapsed is reinforced; the timer is reset,
and the first response emitted after the next minute is reinforced; and so on.
One thing that makes these schedules interesting is that behavior begins
to reflect the timing of the reinforcers. As shown in the lower left cell of
Figure 7.8, the animal tends to respond only a little after each reward, but
the rate of responding picks up as the next scheduled reinforcer becomes
imminent. This characteristic pattern of behavior on FI schedules is known
as scalloping. In effect, the animal learns to time the interval, and elapsed
time becomes a kind of SD for responding. This feature of behavior on FI
schedules has been used to investigate timing processes in animals, as we
will see in Chapter 8.
As you have probably guessed, interval schedules can also be variable.
In variable interval (VI) schedules, there is an average interval after which
the first response is reinforced, but each reinforcer is set up after differ-
ent intervals of time. On a VI 1-minute schedule, the first pellet may be
earned after 30 seconds has elapsed, the second pellet may be earned after
90 seconds, and so forth. Because time becomes less relevant in predicting
reinforcer availability, VI schedules do not produce as much scalloping as
FI schedules; instead, they tend to generate very steady rates of behavior
(see Figure 7.8, lower right cell). One example of behavior on a VI sched-
ule is my checking of my mailbox in the Psychology Department where I
work. Mail, flyers, and university memos appear in that box at irregular
intervals, and I find myself checking my mail at a low but fairly steady
rate throughout the day. I do the same with e-mail.
Interval schedules have an interesting property that is not obvious until
you think about it: The rate of responding can vary widely without affect-
ing reinforcement rate. In ratio schedules, there is always a relationship be-
tween behavior rate and reinforcer rate. The faster the organism responds,
the more rewards he or she will obtain. But this relationship is not true in
interval schedules. For example, an FI 1-minute schedule will produce a
maximum reinforcement rate of 60 reinforcers per hour. Once the organism
hits this maximum, say, with 20 responses per minute, there will be no fur-
ther increase in reinforcer rate with further increases in response rate. The
“molar feedback functions” of interval and ratio schedules—which are the
possible relationships between the rate of responding and the resulting rate
of reinforcement—are very different. Perhaps because of this difference,
response rates on ratio schedules tend to be faster than response rates on
interval schedules when reinforcer rates are equated (Catania, Matthews,
Silverman, & Yohalem, 1977). Alternatively, because interval schedules de-
pend on the crucial interval timing out for reinforcement to be delivered,
interval schedules are more likely than ratio schedules to reinforce waits
between responses. Ratio schedules might also tend to reinforce rapid re-
sponse bursts (e.g., Skinner, 1938).
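To make the difference between these molar feedback functions concrete, here is a small Python sketch (mine, not the author's; it assumes perfectly evenly spaced responding, which is a simplification, and all the numbers are invented) comparing how a steady response rate translates into reinforcers per hour on a ratio schedule versus an interval schedule.

```python
# Illustrative sketch only: simplified molar feedback functions for ratio
# and interval schedules, assuming perfectly evenly spaced responses.

def ratio_feedback(resp_per_hour, n):
    """On an FR n schedule, every nth response is reinforced, so the
    reinforcement rate grows linearly with the response rate."""
    return resp_per_hour / n

def interval_feedback(resp_per_hour, t_seconds):
    """On an FI t schedule, a reinforcer is set up every t seconds and is
    collected by the next response; with evenly spaced responding, each
    reinforcer takes roughly t seconds plus one inter-response time."""
    irt = 3600.0 / resp_per_hour        # seconds between responses
    return 3600.0 / (t_seconds + irt)   # approximate reinforcers per hour

for rate in (30, 60, 300, 1200):        # responses per hour
    print(rate,
          round(ratio_feedback(rate, 10), 1),     # FR 10: keeps climbing
          round(interval_feedback(rate, 60), 1))  # FI 1 min: levels off near 60
```

In this toy calculation, doubling the response rate doubles the payoff on the ratio schedule but adds almost nothing on the interval schedule once the 60-per-hour ceiling is approached.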
The four basic types of reinforcement schedules just described are only a few of the many schedules that are possible. In a so-called compound
schedule, two or more schedules operate. One example is a multiple
schedule in which two or more schedules alternate, with each individual
schedule being signaled by its own SD. (When there are no corresponding
SDs, alternating schedules create a “mixed” schedule.) In a chained sched-
ule, completion of the response requirement for one schedule leads to the
SD of the next component. (In this case, when there are no corresponding
SDs, we have a “tandem” schedule.) For additional types of reinforcement
schedules, see Catania (1998) or even Schedules of Reinforcement by Ferster
and Skinner (1957), a book that is remarkable in the sense that it is almost
nothing but cumulative records produced by rats and pigeons reinforced
on almost every conceivable schedule.
Choice
Another fundamental kind of schedule is a concurrent schedule. In such
a schedule, animals are given the opportunity to engage in two or more
responses that are each reinforced according to separate schedules that
are running at the same time. For example, on a concurrent VI VI sched-
ule, a pigeon may peck either of two lighted keys—each of which pays
off according to its own variable interval schedule—which operate in-
dependently. The bird may switch between keys whenever it wants to,
but a reinforcer that happens to be waiting to be delivered for the first
peck on the new key is usually not presented for several seconds after
the changeover has occurred. Concurrent schedules of reinforcement are
especially interesting because they involve choice. In the real world, we
are always choosing among any number of alternative behaviors, each of which has its own separate payoff. By manipulating the reinforcement schedule
on two alternatives, we can begin to discover the laws that govern how
we choose what to do.
Richard Herrnstein (1961) showed that behavior on concurrent sched-
ules is remarkably lawful. In his experiment, pigeons were allowed to
respond on several pairs of VI schedules. A given pair of VI schedules
remained in effect for many days (about a month); Herrnstein was inter-
ested in performance generated by the schedules after the performance had
stabilized. In the final sessions on each pair of VI schedules, he counted
the number of pecks at each key and also the number of reinforcers earned
for pecks at each key. Overall, a very regular relationship was obtained. As
Figure 7.9 shows, the percentage of pecks at any one key always equaled
the percentage of reinforcers that were earned there. We can express this
relationship in very simple terms. We will call the number of pecks to the
two alternatives B1 and B2 (for Behaviors 1 and 2) and the number of rein-
Figure 7.9  Matching on concurrent VI VI schedules. (A) Pecks to one key are
reinforced on one VI schedule, and pecks to another key are reinforced on
another VI schedule. When the proportion of responses on one alternative is
plotted as a function of the proportion of rewards earned there, one observes
a diagonal line; the two proportions are equal (i.e., they match). (B) The same
results are obtained when the subject can switch between schedules presented
on one key by pecking another key. (After Herrnstein, 1971.)

forcers earned on each alternative R1 and R2. If the proportion of pecks on B1 equals the proportion of reinforcers that were earned there, then

B1/(B1 + B2) = R1/(R1 + R2) (7.1)

which merely says (again) that the two proportions are equal.
Equation 7.1 is known as the matching law, a fundamental rule that de-
scribes choice in concurrent schedules of reinforcement (Herrnstein, 1970).
It is important to recognize that Herrnstein’s (1961) results, and the match-
ing law that describes them, are far from trivial. With two VI schedules, the
birds could have earned a given number of reinforcers (R1 or R2) with any
of a very large number of possible numbers of pecks (B1 and B2). Thus, the
relationship described in Equation 7.1 is not inevitable; instead, it is rather
subtle and interesting. With a little algebra, the equation can be written as

B1/B2 = R1/R2 (7.2)

and this equality necessarily also holds true.
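As a concrete numerical illustration of Equations 7.1 and 7.2, consider the short Python sketch below; the reinforcer counts are entirely invented and are not from Herrnstein's data.

```python
# Hypothetical numbers illustrating the matching relationship (Eqs. 7.1 and 7.2).

def matched_proportion(r1, r2):
    """Proportion of behavior expected on alternative 1 if matching holds."""
    return r1 / (r1 + r2)

r1, r2 = 40, 10   # reinforcers earned on keys 1 and 2 (invented values)
print(f"Predicted proportion of pecks on key 1: {matched_proportion(r1, r2):.2f}")  # 0.80
print(f"Predicted ratio of pecks, B1/B2: {r1 / r2:.1f}")                            # 4.0
```

If the bird earned 80% of its reinforcers on key 1, matching predicts that about 80% of its pecks were directed there, whatever the absolute number of pecks happened to be.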


The relationship seems to hold up under a surprisingly wide range of
species and methods; for example, the law also works in humans. Conger
and Killeen (1974) had college students converse on the subject of drug
abuse with two experimenters, who were seated at opposite ends of a long
table. By saying things like “that’s a good point,” etc., the experimenters
reinforced comments made by the participants. Unbeknown to the partici-
pants, the experimenters delivered these verbal reinforcers according to
different VI schedules. After 15 minutes, the participants were directing
their conversation in accordance with the reinforcers. In fact, as Figure
7.10 shows, the proportion of time the students spent talking to each of the
experimenters matched the proportion of verbal reinforcers delivered by
the experimenters. Verbal exchanges follow rules of operant conditioning,
and in this case, they conformed nicely to the matching law.
Figure 7.10  Matching in humans. The proportion of responding to each experimenter is plotted against the proportion of reinforcements delivered there; different symbols are results with different participants. (After Conger & Killeen, 1974.)

The law has been studied in animals using other procedures. One slightly different procedure was invented by Findley (1958; see Figure 7.9B). In this case, pigeons choose to peck on one of two VI schedules that are available on a single key. One schedule is signaled by the key being one color, and the other schedule is signaled by the key being another color. The bird can switch between schedules by pecking a second key, the switching key. In this sort of method, very similar results are obtained, but in this case, experimenters typically measure the time the bird spends in the two schedules (T1 and T2). Here, the same relationship obtains:

T1/T2 = R1/R2 (7.3)

Still another indication of the law's generality is that the relationship still applies when experimenters have manipulated the magnitudes of the rewards and totaled the amount of the two reinforcers earned (Catania, 1963) or varied the delay in delivery of the reward after the response (Chung & Herrnstein, 1967). (Delay effects
can be seen as immediacy, or the reciprocal of delay.) According to Williams (1988, 1994b), all these parameters of reward can be put together (see also
Davison & McCarthy, 1988; Grace, 1994) in a way that relates to a single
variable—the reinforcer’s “value.” Thus, the most general statement of
matching is

B1/B2 = V1/V2 (7.4)

where V1 and V2 are the value of each reward as determined by its number,
size, and immediacy. It is a remarkably general law.
Although a great deal of research attests to the matching law’s general-
ity (for reviews, see Davison & McCarthy, 1988; McDowell, 2013; Nevin,
1998; Williams, 1988, 1994b), deviations from perfect matching are common.
Sometimes animals “overmatch,” and B1/B2 is consistently a little greater
than R1/R2. Other times, animals “undermatch,” and B1/B2 is a little less
than R1/R2. Baum (1974) proposed a more generalized form of the matching
equation, which includes additional bias and sensitivity terms; this form
has the power to describe cases of overmatching and undermatching (see
McDowell, 2005, 2013, for further extensions).
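Baum's generalized form is usually written as a power function of the reinforcer ratio. The sketch below is my illustration (the parameter values are invented) of how the sensitivity term captures undermatching and overmatching while the bias term captures a constant preference for one alternative.

```python
# Illustrative sketch of generalized matching: B1/B2 = b * (R1/R2) ** s,
# where b is a bias parameter and s is a sensitivity parameter (Baum, 1974).
# All numbers below are invented.

def generalized_matching(r1, r2, b=1.0, s=1.0):
    """Predicted response ratio B1/B2; b = s = 1 gives strict matching."""
    return b * (r1 / r2) ** s

r1, r2 = 40, 10
print(round(generalized_matching(r1, r2), 2))          # strict matching: 4.0
print(round(generalized_matching(r1, r2, s=0.8), 2))   # undermatching: about 3.0
print(round(generalized_matching(r1, r2, s=1.2), 2))   # overmatching: about 5.3
```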
Although the matching law describes the relationship between payoff
and choice rather well, it says nothing about the psychological processes
that lead the animal to behave in such a manner. In this sense, it is beauti-
fully true to Skinner’s interest in a purely empirical description of how op-
erant behavior relates to its consequences. Several explanations of matching
have nonetheless been proposed. Animals might choose to respond on one
alternative or the other on a moment-to-moment basis so as to maximize
the momentary rate of reinforcement (e.g., Shimp, 1966). Or, perhaps the
animals choose in a way that somehow maximizes the overall rate of re-
inforcement in the whole session (e.g., Rachlin, Green, Kagel, & Battalio,
1976). Still another possibility is that animals might keep shifting between
the alternatives so as to always improve the local rate of reinforcement. In
this case, they would stop shifting between the alternatives (and behavior
would stabilize) when the two local rates of reinforcement are equal (e.g.,
Herrnstein & Vaughan, 1980). This process, called melioration, can produce
matching, and it has other interesting implications for behavior (e.g., Her-
rnstein & Prelec, 1992). There are results that are difficult for each of these
explanations of matching to handle (Williams, 1994b). The matching law
itself appears to work, however, and it provides additional insights that
we will consider in the next section.
Although the matching law tells us a lot about what determines the
relative rate of operant behavior, it would be a mistake to think that a
behavior’s rate is the same as its strength. Tony Nevin and his colleagues
(e.g., Nevin, 1998; Nevin & Grace, 2000) have argued that operant behavior
also has a property that is analogous to “momentum”: Just as a moving car
is difficult to stop, a behavior can be resistant to change. Momentum is not
just a function of an object’s speed; it also depends on its mass. In a similar
sense, a behavior's resistance to change is not necessarily predicted by behavior rate, or (hence) the matching equation (e.g., Nevin, Tota, Torquato,
& Shull, 1990). Resistance to change also depends on other factors, such
as the extent to which reward is associated with stimuli in the background
(e.g., Craig, Nevin, & Odum, 2014).
Choice is everywhere
Although choice is an obvious aspect of concurrent schedules, it is also
involved in even the simplest operant experiment. That is, even when the
rat in the Skinner box is given only a single lever to press, it chooses to
press the lever over a host of other alternative behaviors—it could instead
scratch an itch, sniff around in a corner, or curl up and fall asleep. The same
is true of any single operant that you or anyone else might perform; there
are always a large number of available alternatives. Thus, I am currently
writing at my computer instead of playing solitaire, getting some coffee,
or taking a nap. You are similarly reading this book right now instead of
doing any of a number of other things, each of these other behaviors pre-
sumably having its own payoff. Choice is actually everywhere. In real life,
concurrent schedules are probably the rule.
Given this idea, we can state the matching law in its most general form.
Even when we focus on a single operant behavior (e.g., B1) with its own
reinforcement rate (e.g., R1), we are always given a choice of performing
the behavior or of performing other behavior. Let’s call the other behavior
BO; it has its own reinforcement rate, RO. The idea is that the matching law
still applies. Thus,

B1/(B1 + BO) = R1/(R1 + RO) (7.5)

However, the quantity B1 + BO is all the behavior possible in a given situation, and the total cannot vary from time to time; therefore, we can consider
it a constant, K. So, substituting,

B1/K = R1/(R1 + RO)

and, with a little additional algebra (multiplying both sides by K), we can rewrite the equation as

B1 = K × R1/(R1 + RO) (7.6)

This final equation, which describes the rate of behavior to one alternative
in any given operant situation, is called the quantitative law of effect
(Herrnstein, 1970).
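A small Python sketch (all numbers are invented) shows how Equation 7.6 behaves: for a fixed K, raising the reinforcement available for other behaviors (RO) lowers the predicted rate of the target behavior at every value of R1.

```python
# Illustrative sketch of the quantitative law of effect: B1 = K * R1 / (R1 + RO).
# K and the reinforcement rates below are invented for demonstration.

def response_rate(r1, k, r_other):
    """Predicted rate of the target behavior given its own reinforcement rate (r1),
    the total amount of behavior possible (k), and the reinforcement rate for
    everything else the organism might do (r_other)."""
    return k * r1 / (r1 + r_other)

K = 80                                   # hypothetical ceiling on responding
for RO in (5, 50):                       # little vs. much alternative reinforcement
    rates = [round(response_rate(R1, K, RO), 1) for R1 in (10, 40, 160)]
    print(f"RO = {RO}: B1 at R1 = 10, 40, 160 -> {rates}")
```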
There are several points to make about this simple equation. First, it
describes how behavior rate (B1) varies as a function of reward rate (R1) in
single-operant situations. Figure 7.11 shows some classic results reported
by Catania and Reynolds (1968), wherein different pigeons were trained on
Figure 7.11  Response rates of six pigeons, each one tested with several differ-
ent VI schedules (each of which paid off at different reinforcements/hour). The
results of each bird are consistent with the quantitative law of effect. Response
rate (y axes) is B1 and reinforcement rate (x axes) is R1. Numbers in each panel
are the K and RO values for each bird. (Data from Catania & Reynolds, 1968;
after Herrnstein, 1971.)

a series of VI schedules. After behavior had stabilized on each schedule, the number of responses (B1) and number of reinforcers (R1) were counted, and
another data point was added to each bird’s graph. As Figure 7.11 shows,
the behavior rate increased as a function of reward rate, which is not terribly
surprising. In general, though, the shape of the function is highly consistent
with Equation 7.6. Each bird’s behavior differed a little from the other birds’
behavior; however, in each case, the functions are nicely captured by Equa-
tion 7.6, with K and RO varying among birds. In effect, the birds for whom
increases in reinforcer rate had less effect on B1 had higher rates of RO. Ca-
sually speaking, the quantitative law of effect implies that these birds were
more distracted by the high rate of alternative reinforcement that they were
idiosyncratically deriving from the environment. There is a lawfulness to
the behavior across birds, but there is room for individual differences, too.
Let us take a step back and consider the broader implications of the idea
that B1 is influenced by both its own reinforcement (R1) and reinforcement
for alternatives (R O ). We might want to weaken an undesirable behavior in
a person—say, taking dangerous drugs. One obvious technique would be
to reduce R1, perhaps by reducing the availability of the drug. This idea is
a familiar one: Any operant perspective predicts that reducing a behavior’s
rate of reinforcement will reduce the behavior. A less obvious alternative,
however, would be to increase RO—that is, increase the reinforcement de-
livered for other behaviors that are available. In the laboratory, this has
been accomplished by presenting extra “free” reinforcers that were not
contingent on the target response (e.g., Rachlin & Baum, 1972). The present
interpretation is that by increasing RO, we are weakening B1. According to
the quantitative law of effect, operants can be weakened by increasing the
reinforcement earned for alternative behaviors.
There is also a variation of this idea. Suppose that you have a friend
or an adolescent son or daughter who is beginning to experiment with
drugs or alcohol. When is experimentation most likely to turn into a
problem? Note again that drugs and alcohol are reinforcers; in fact, drug
problems can be thought of as situations in which drug taking (an oper-
ant behavior, B1) becomes very high. According to the quantitative law of
effect, B1 is most likely to become high for individuals who derive little
other reinforcement from their environment (i.e., those for whom RO is
low). A drug will thus become especially addictive (generating a high
rate of responding) when there is little other reinforcement around. We
have already seen that B1 can be weakened by increasing RO. But now
you can also see that problems in principle are preventable by building
environments in which positive, prosocial behaviors are available and
adequately reinforced. To me, that may be the most important implication
of the quantitative law of effect.
Impulsiveness and self-control
While we are on such a righteous theme, another line of research is also
relevant. Choice has also been studied when animals and humans are al-
lowed to choose between a large, delayed reward and a smaller reward
that is more immediate. This seems analogous to many decisions we make
in real life. We may decide to go to a movie (an immediate, but rather
modest reward) instead of studying for the GREs (with its delayed, but
potentially larger reward). Or we may decide to eat a tasty, high-calorie
dessert (an immediate, small reward) instead of getting up and taking
a walk (which would have a delayed but positive effect on health). In a
similar way, a smoker may decide to smoke a cigarette (a small, immediate
reward) instead of abstaining and later having better health (a delayed, but
large reward). In each case, if we delay our gratification and choose the
delayed (larger) reward, we are said to have exercised “self-control.” On
the other hand, if we choose the more immediate (smaller) reward, we are
showing “impulsiveness.” There is a large literature of research devoted
to investigating the choices made by animals and humans in these kinds
of situations (e.g., see Logue 1988, 1995, 1998, for reviews).
One key finding is that the behavior of animals and humans is often
quite impulsive—they often choose immediate (although smaller) re-
wards over delayed (although larger) rewards, even though it seems rather maladaptive in the long run to do so. To understand this, we can
return to a fact we first encountered in Chapter 2: Delayed reinforcers
are less reinforcing than immediate ones. To put it somewhat differently,
delayed rewards have less value to us than immediate ones. For example,
most people would undoubtedly prefer $100 today instead of $100 next
month or next year. To study this kind of preference in more detail, we
can give people a series of choices pitting $100 next month, say, against
different dollar amounts today (e.g., $20, $40, $60, $80). In this way, we can
discover an amount of money given immediately that is equal in value
to the $100 delayed by a month—people might be indifferent between getting $80 today and getting $100 next month. We can then find similar "indifference
points” for $100 delayed by, say, 6, 12, or 36 months. Figure 7.12 shows
the results when undergraduates have been tested this way for hypotheti-
cal money, beer, candy, and soda (Estle, Green, Myerson, & Holt, 2007).
As you can see, the value of each reward falls off systematically as it is
delayed (see Mazur, 1997, for one mathematical function). The reward’s
value is said to be “discounted” by delay; the fact that value decreases as
a function of delay is called delay discounting. Interestingly, the value of
beer, candy, and soda all decreased at roughly equal rates (the functions
are equally steep). Money’s value decreased, too, but not as quickly (be-
cause money always has some value!). Smaller amounts of each reward
were also discounted a bit more steeply than larger amounts.
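The mathematical function alluded to above is often written as a hyperbola. The sketch below is only an illustration with invented discounting parameters; it is not fitted to the Estle et al. (2007) data.

```python
# Illustrative hyperbolic delay-discounting function, V = A / (1 + k * D),
# of the kind discussed in Mazur's work. The k values here are invented;
# a steeper discounter simply has a larger k.

def discounted_value(amount, delay, k):
    """Subjective (present) value of a reward of a given amount after a delay."""
    return amount / (1.0 + k * delay)

amount = 100.0                      # e.g., a hypothetical $100 reward
for k in (0.05, 0.5):               # shallow vs. steep discounter
    values = [round(discounted_value(amount, d, k), 1) for d in (0, 6, 12, 36)]
    print(f"k = {k}: value at delays of 0, 6, 12, 36 months -> {values}")
```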

Figure 7.12  Delay discounting: The subjective value of different rewards (money, beer, candy, and soda) to human participants as a function of their delay, for (A) small and (B) large reward amounts. (Data from Estle, Green, Myerson, & Holt, 2007.)

Figure 7.13  The Ainslie-Rachlin rule. The choice between a small, immediate reward versus a larger, delayed reward is governed by which one is more valuable to you at a particular time. Reinforcer value for the small and the large reward is plotted over time. At Time 2, a smaller, more immediate reward has more value than a larger reward that will occur after a longer delay. You therefore choose the smaller reward. In contrast, at Time 1, the larger reward has more value to you than the smaller reward, despite the same difference in delay.

George Ainslie (1975) and Howard Rachlin (1974) used delay discount-
ing to explain self-control and impulsiveness in situations in which we
choose between small, immediate rewards and large, delayed rewards.
The idea, again, is that choice is determined by the relative “value” of
the two rewards, with the value of each being determined by both their
size and their delay. The idea is presented in Figure 7.13. As the figure
illustrates, larger reinforcers have a higher value than smaller reinforc-
ers, and as we just discussed, their value decreases as a function of their delay (the curves here are like those in Figure 7.12, merely flipped left
to right). There is one crucial thing, however: At any given time, the
individual should choose the more valuable reward—that is, the reward
whose value is currently higher. If we offer the person the choice at Time
2, when both rewards will happen soon, the value of the more immediate
reward is higher than the delayed reward. He or she therefore behaves
impulsively and chooses the immediate reward. Notice, though, that if
we offer the choice at Time 1, when the rewards are scheduled to happen
more remotely in time, the relation between their values has reversed.
Because of the way that reward value decreases over the delay, the larger
reward now has greater value than the smaller reward. The person might
now demonstrate self-control and choose the more delayed reward. The
different functions predict that what is called “time to reward” will affect
impulsive behavior and self-control.
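The crossover in Figure 7.13 can be illustrated with the same kind of hyperbolic value function. This is a sketch of mine; the amounts, delays, and discounting rate are all invented. When both rewards are close, the small one is worth more, but when both are far away, the large one is.

```python
# Illustrative preference reversal in the spirit of the Ainslie-Rachlin rule.
# All amounts, delays, and the discounting rate k are invented.

def value(amount, time_until_reward, k=0.5):
    """Hyperbolically discounted value of a reward."""
    return amount / (1.0 + k * time_until_reward)

small, large = 40.0, 100.0
extra_delay = 10.0    # the large reward always arrives this much later

for wait in (2.0, 30.0):              # time from the choice point to the small reward
    v_small = value(small, wait)
    v_large = value(large, wait + extra_delay)
    choice = "small (impulsive)" if v_small > v_large else "large (self-control)"
    print(f"wait = {wait}: small = {v_small:.1f}, large = {v_large:.1f} -> {choice}")
```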
This prediction is consistent with empirical research. For example,
Green, Fisher, Perlow, and Sherman (1981) had pigeons choose between
pecking a red key for a small reward and a green key for a larger reward.
The larger reward was always delayed 4 seconds longer than the smaller
reward. In one condition, the larger reward was presented after a delay
of 6 seconds, and the smaller reward was presented after only 2 seconds.
The birds’ behavior was impulsive—they consistently chose the smaller,
more immediate reward. But, in another condition, both rewards were
presented after longer and longer delays. As before, the smaller reward
always occurred 4 seconds sooner than the bigger reward. As the delays
increased, however, the birds began demonstrating self-control and began to choose the bigger, more delayed reward. The results are presented in Figure 7.14. When the average choice-to-reward delay was relatively short, the birds chose the smaller, more immediate reward; when the choice-to-reward delay was relatively long, they chose the larger, more delayed reward.

Figure 7.14  Choice for a larger, delayed reward over a smaller, immediate reward. Self-control increases as the time between the choice and actual presentation of the reward increases. The percentage of choices for the delayed reward is plotted against the number of seconds before the end of the trial. (After Green et al., 1981.)
One of the nice things about the Ainslie-Rachlin approach (see Figure
7.13) is that it immediately suggests ways to exercise “impulse control.” I
have just illustrated one of them: If you can arrange to make the choice
long before the rewards are actually due, you should be less tempted by
the pesky little immediate reward. So-called precommitment strategies
take advantage of this. The idea is to commit oneself early in such a way
that an opportunity to make an impulsive choice later will be prevented.
The classic example is using an alarm clock to get up in the morning and
get to that early class. Early in the morning, just before the class, you have
a choice of remaining in bed and getting a little extra sleep (a small, im-
mediate reward) or getting up and going to class and getting intellectual
stimulation or a better grade (a larger, delayed reward). Some of us might
choose to snooze a little more and miss the class. But if you set your alarm
clock the night before, you precommit to waking up and getting the larger
reward by effectively precluding the response of continued sleep; it is much
easier to make this choice earlier. Interestingly, pigeons given an analogous
precommitment option (i.e., the choice to peck on a key that will produce
an early commitment to a larger reward and preclude another choice later)
will sometimes use it (Ainslie, 1975; Rachlin & Green, 1972), although not
all pigeons do. Precommitment is a practical way to defeat impulsiveness
and practice self-control.
The approach also suggests other ways to encourage impulse control.
For example, choosing the bigger delayed reward will be easier if the
value of the tempting immediate reward were lowered or if the value of
the delayed reward were increased—either would make the value of the delayed reward consistently higher than the immediate reward in Figure 7.13. Therefore, a smoker would be less likely to smoke if he or she had
to pay money each time he or she smoked (decreasing the value of the
immediate cigarette) or got paid for not smoking (increasing the value
of choosing better health). “Incentive-based” and “contingency manage-
ment” treatments of smoking and other behaviors, which I will discuss
more below, use exactly this kind of thinking. In animals, other evidence
suggests that giving the subject experience with delayed rewards can
increase the choice of large delayed rewards when self-control is tested
later (e.g., Stein, Renda, Hinnenkamp, & Madden, 2015). Experience with
the delayed rewards may habituate the individual to the aversiveness of
waiting. Perhaps similarly, humans discount reward less steeply over
time if they are first asked to think about positive events in life that they
are looking forward to (e.g., Daniel, Stanton, & Epstein, 2013). Delayed
rewards are not necessarily bad.
There are yet other factors that influence impulsiveness and self-control.
Grosch and Neuringer (1981) studied choice in a situation in which pigeons
received a less preferred (but immediate) type of grain if they pecked a key
and a highly preferred (but delayed) type of grain if they did not peck the
key. Grosch and Neuringer showed that “impulsive” choices for the imme-
diate reward were more likely if the two rewards were visible throughout
the trials. This effect was reduced if the birds could peck a key on a rear
wall that moved their attention away from the visible grains. Interestingly,
when Grosch and Neuringer presented Pavlovian CSs that predicted food,
impulsive choices also increased. The results paralleled earlier findings with
children reported by Walter Mischel and his associates (e.g., Mischel, Shoda,
& Rodriguez, 1989). Mischel suggested that “hot thoughts”—ideas about
how nice a reward will taste and so forth—can often defeat attempts at self-
control, whereas distracting thoughts and behaviors (such as playing with a
toy or thinking “cool” thoughts) can often help efforts at self-control. Hot and
cold thoughts are related to issues connected with the motivating effects of
rewards (and their signals) that will be discussed in Chapter 9. Finally, at least
one study followed up on evidence that steep discounting may be correlated
with poor working memory (Shamosh et al., 2008): Drug abusers who were
given extensive practice with working memory tasks (like repeating digits
or words that were recently presented to them) decreased the rate at which
they discounted rewards (Bickel, Yi, Landes, Hill, & Baxter, 2011). Working
memory training thus made them appear less impulsive.
Do people show personality differences in their overall self-control or
ability to delay gratification? Warren Bickel and his colleagues have studied
drug abusers in situations analogous to the ones just described (see Bickel
& Johnson, 2003; Bickel & Marsch, 2001). Drug users seem to choose an im-
mediate “hit” of a drug over performing more positive, prosocial behaviors
that might be more rewarding in the long term. Several studies (e.g., Kirby,
Petry, & Bickel, 1999; Madden, Bickel, & Jacobs, 1999) have confirmed that
heroin abusers may be more likely than others to choose immediate over
delayed rewards, even when the rewards are hypothetical and financial.
Heroin abusers thus behave as if they are generally impulsive, with curves
relating reward value to delay (see Figure 7.12) that are generally very
steep—so that the value of a distant, large reward goes to zero very quickly.
In fact, steeper discounting rates are correlated with a variety of behavioral
problems, such as drug dependence, problem gambling, obesity, and HIV
risk behaviors (e.g., see Bickel et al., 2011). Which causes which? Does a
person’s tendency to discount steeply cause him or her to develop such
problems, or do the problems cause the person to start discounting more
steeply? There is evidence that a tendency to discount steeply can lead to
future problems. For example, animals that are steeper discounters for
food later self-administer more alcohol or cocaine than other animals (e.g.,
Carroll, Anker, Mach, Newman, & Perry, 2010), and high school students
who discount money steeply are more likely than other students to become
smokers later (Audrain-McGovern et al., 2009). Similarly, “self-control”
measured in children with other methods (e.g., by surveying teachers and
parents about a child’s perseverance, impulsivity, etc.) can also predict
future unemployment (Daly, Delaney, Egan, & Baumeister, 2015) and sub-
stance dependence, physical health, personal finances, and even criminal
behavior when they are adults (Moffitt et al., 2011). On the other hand,
rats that self-administer cocaine can show steeper discounting for food
subsequently than do controls that did not self-administer cocaine (e.g.,
Mitchell et al., 2014). Thus, the causal arrow may go both ways. It is worth
noting that people can also have steep discounting curves for some reinforc-
ers and shallow ones for others (e.g., Chapman, 1998; Green & Myerson,
2013), suggesting that there is not necessarily a single, steep-discounting
personality type. Especially interesting is that although cigarette smokers
show steep curves relating reward value to delay (as do heroin users),
ex-smokers do not. This difference suggests that the tendency for smok-
ers to discount delayed reward so steeply may be reversible rather than a
permanent personality trait (Bickel, Odum, & Madden, 1999).
Nudging better choices
Thanks in part to research on delay discounting and the quantitative law
of effect, the idea that organisms always have choice is now central to
modern thinking about operant behavior. It is also central to how we treat
operant behavior in the clinic, where the goal is often to help encourage
people to make better choices. For example, so-called contingency man-
agement or incentive-based treatments are now widely used to help
reduce undesirable behaviors like smoking, overeating, and drug abuse
(e.g., Higgins, Silverman, & Heil, 2008). In these treatments, individuals are
given reinforcers (like money, prizes, or vouchers that can be exchanged
for things like ski passes or camera equipment) for abstaining from the
undesired behavior and performing more desirable ones. The contingencies
encourage the person to choose the alternative reinforcers over the harmful
ones. Contingency management treatments can be successful at reducing
behavioral excesses (see Andrade & Petry, 2014, for one recent review).
In one of my favorite examples, researchers used vouchers to reinforce
pregnant women for not smoking (see Higgins et al., 2012). Smoking dur-
ing pregnancy has many negative consequences, like increasing the odds
of preterm birth and restricting growth of the fetus. In the Higgins et al.
(2012) study, mothers in a treatment group received vouchers over several
weeks if they did not smoke, whereas mothers in a control group received
the same vouchers without being required to quit. The treatment (giving
vouchers for not smoking) successfully reduced the mothers’ smoking.
But even more important, it increased their babies’ birth weights and de-
creased the percentage of babies who were born underweight. By reinforc-
ing healthy behavior in the pregnant mothers, contingency management
thus improved the well-being of both the mother and the child.
In their book Nudge: Improving Decisions About Health, Wealth, and Hap-
piness, Richard Thaler and Cass Sunstein (2008) suggest that people can
be helped to make better choices by subtly modifying the environments
in which they make them. For example, people in a cafeteria line might
choose healthier foods if the healthy foods are placed at eye level and the
less healthy foods are harder to reach. The people are free to choose the
foods they want, but the healthy choices are subtly encouraged—they are
“nudged.” In another example, people may save more money for retire-
ment (with beneficial long-term consequences) if they are enrolled au-
tomatically in a savings plan than if they have to fill out forms to enroll
themselves once they are eligible. Building such nudges into the environ-
ment (and thereby practicing sensible “choice architecture”) reminds me
of earlier ideas of B. F. Skinner, who in many places (including his classic
book Beyond Freedom and Dignity, 1971) argued that we can use the science
of behavior to create environments with contingencies of reinforcement
that would increase human happiness and well-being.
Behavioral economics: Are reinforcers all alike?
Many psychological perspectives on choice, like the quantitative law of
effect and the Ainslie-Rachlin rule, accept a perspective that has been com-
mon in the psychology of learning: Different reinforcers are supposed to
be all alike and can be scaled along a single dimension of value. In truth,
this is an oversimplification. The world is full of things that reinforce but
do not necessarily substitute for one another. A Pepsi might substitute
perfectly for a Coke (for most of us, anyway), but neither one is quite the
same as a good copulation or a good book.
The issue of reinforcer substitutability has been recognized by operant
psychologists who have increasingly turned to principles of economics to
help understand behavior (e.g., Allison, 1979, 1983; Bickel, Johnson, Kof-
farnus, MacKillop, & Murphy, 2014; Hursh, 1980, 2014). The area within
operant psychology in which behavior is analyzed using economic prin-
ciples is called behavioral economics. It is an important and growing field.


In general, substitutability is recognized as a continuum that describes
various ways in which reinforcers can actually interact with one another
(Green & Freed, 1998).
A good way to discover the substitutability of two reinforcers is to vary
the cost of obtaining one reinforcer and see how it affects consumption of
the other. From an economic perspective, reinforcers are commodities that
organisms consume. As every economics student knows, the consumption
of a commodity depends on its price: As the commodity becomes more
expensive, we consume less of it; our “demand” for it decreases. The re-
lationship between price and consumption is easy to study in the operant
laboratory, where we can increase the price of a commodity by increas-
ing the amount of work required to earn it—that is, by manipulating the
schedule of reinforcement. Increasing the price of food by increasing the
fixed ratio schedule, for example, decreases the consumption of food (e.g.,
Foltin, 1991; Lea & Roper, 1977).
Figure 7.15 shows hypothetical data indicating how the consumption
of a commodity or reinforcer might change as a function of its “price.”
This sort of function, showing demand for a commodity as a function of
its price, is known as a demand curve. As described above, consump-
tion tends to decrease as the price increases. That is especially true of
commodities that are said to be "elastic"—consumption of them is very dependent on price (see Figure 7.15A). So-called inelastic commodities are ones whose consumption is fairly constant over price (see Figure 7.15B).
Commodities whose demand functions are less elastic are necessities,
whereas those commodities whose demand functions are very elastic are
luxuries. Hursh and Natelson (1981) compared the rat’s demand curves
for food and for electrical stimulation of the brain. When price was in-
creased with a leaner VI schedule, the demand for brain stimulation
declined. As price for food increased, however, there was little decline
in the demand for food.
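One simple way to put a number on elasticity, sketched below with invented consumption figures, is to compute the slope of the demand curve in logarithmic coordinates: roughly, the percentage change in consumption per percentage change in price (here, the FR requirement).

```python
# Illustrative sketch: quantifying demand elasticity from consumption at two
# prices (FR requirements). All consumption values are invented.
import math

def elasticity(q1, q2, p1, p2):
    """Slope of the demand curve in log-log coordinates. Values near zero mean
    inelastic (necessity-like) demand; large negative values mean elastic
    (luxury-like) demand."""
    return (math.log(q2) - math.log(q1)) / (math.log(p2) - math.log(p1))

# Hypothetical consumption when the FR requirement is raised from 5 to 50:
print(round(elasticity(100, 90, 5, 50), 2))   # inelastic, food-like: about -0.05
print(round(elasticity(100, 20, 5, 50), 2))   # elastic, luxury-like: about -0.70
```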
Figure 7.15  Demand curves describe the demand for a commodity as a function of its price, or fixed ratio (FR) requirement; consumption is plotted against price. (A) Demand for an elastic commodity is very sensitive to price; therefore, demand typically decreases as prices increase. (B) Demand for an inelastic commodity is not as sensitive to price.

Figure 7.16  Demand for two commodities (A and B) as one of them (A) increases in price. Different panels illustrate commodities that are substitutes, independents, or complements of each other. Commodities are not always substitutes for one another; that is, reinforcers are not all alike.

Now let us consider what effect increasing the price of one reward might have on the consumption of a second reward that is also available. Three possibilities are shown in Figure 7.16. In Figure 7.16A, as the
price of one reinforcer, A, increases, its own consumption declines, and the
consumption of the other reinforcer, B, goes up (e.g., Rachlin et al., 1976).
These reinforcers are true substitutes: Subjects essentially exchange one
commodity for the other. Real-world examples might be Coke and Pepsi
or potato chips and pretzels. In Figure 7.16B, the commodities are inde-
pendents: As the price of one commodity, A, goes up and its consumption
goes down, the consumption of the other commodity, B, does not change.
Examples here are compact discs and umbrellas, or Cokes and books.
Figure 7.16C describes complements: As the price of one commodity, A,
goes up and its consumption decreases, so does consumption of the other
commodity, B. Examples are bagels and cream cheese, hot dogs and hot
dog buns, or chips and salsa. The various relationships that are possible
between rewards make it clear that different reinforcers are not necessar-
ily equivalent. To understand the effects of one reinforcer, we must know
what that reinforcer is and what its alternatives really are.
The substitutability concept is useful when we try to understand the
interactions between reinforcers in the real world. For example, although
drugs of abuse are reinforcers, they are not mere substitutes for one another.
Bickel, DeGrandpre, and Higgins (1995) summarized 16 studies in which
two or more drugs were available to humans or animals, and consump-
tion was measured while the price of one drug was manipulated. They
found evidence of substitutability, independence, and complementarity.
For example, different ways of administering the same drug (e.g., orally
or intravenously) were substitutes, showing an arrangement like the one
described in Figure 7.16A. Ethanol also appears to be a substitute for PCP:
As the price of PCP increased, its consumption decreased and that of etha-
nol increased. (Interestingly, the reverse did not hold true: Increasing the
price of ethanol decreased ethanol consumption but did not change PCP
consumption.) Other drugs were complements: When the price of alcohol increased, its consumption decreased, and so did the consumption of
cigarettes. These relationships clearly demonstrate the utility of economic
concepts when thinking about real-world reinforcers.
One point of this discussion is that reinforcers are not all alike. Al-
though they have similar effects (each reinforcer increases the probability
of a behavior that leads to it), they do not necessarily substitute for one
another. Instead, reinforcers are real things that can interact in different
ways. The quantitative law of effect and self-control theory are excellent
tools for understanding the principles governing choice when the payoffs
are substitutable. But when rewards have relationships of independence
or complementarity, other principles will also be required.
Another point is that price affects choice. Consistent with the curves
in Figures 7.15 and 7.16, the price of foods can influence which foods we
buy, and this simple idea can be used to provide another kind of nudge
to encourage healthy behavior. For example, we know that if the price of
an unhealthy food item is increased, consumption of it will go down; in a
complementary way, if the price of a healthy food is decreased, consump-
tion of it will go up (for reviews, see, e.g., Epstein et al., 2012; Thow, Downs,
& Jan, 2014). Which is the better nudge? Is it more efficient to tax the un-
healthy food options or subsidize the healthy ones? Epstein, Dearing, Roba,
and Finkelstein (2010) tested this question by having mothers shop in a
simulated grocery store in which they were given a fixed amount of money
to buy food for the home. In some conditions, all the less healthy foods (e.g.,
hot dogs, potato chips, and mayonnaise) were made more expensive, and
in other conditions, all the healthier foods (e.g., chicken, broccoli, nonfat
cottage cheese) were made cheaper by
a similar amount. Which strategy led
to healthier purchases overall (Figure
7.17)? As you might expect, increas-
ing the price of the unhealthy options
decreased the total calories that were
purchased. Surprisingly, though, de-
creasing the price of healthy foods ac-
tually led to the mothers purchasing
more calories overall (see also Giesen,
Havermans, Nederkoorn, & Jansen,
2012). Apparently, decreasing the price
of the healthy foods did increase pur-
chase of these items, but it left the par-
ticipants with extra money to spend on
less healthy (but still attractive) foods!
In a very preliminary way, this kind of
result suggests that taxing junk food
might have a better effect on choice
than subsidizing healthy foods. But choice can be influenced—and is nudgeable—by manipulating the price of different rewards.

Figure 7.17  Choice in the grocery store aisle.

Theories of Reinforcement
Skinner’s definition of a reinforcer (something that increases the probability of a response when it is made a consequence of that response) is not a theory. It describes a
relationship between a behavior and a consequence, but it does not explain
how or why the reinforcer increases the behavior or what kinds of things
will be reinforcers and what kinds of things will not be. Notice that this
is equally true of the matching law. When we use the matching equation,
we count the responses made and the reinforcers earned, plug them into
the equation, and find that the equation worked. But notice that we use
the past tense. The matching equation describes what occurred, but it does
not predict exactly how an individual organism will match, nor does it
predict what events will be reinforcers. It is a perfect extension of Skinner’s
reinforcement principle.
For these reasons, Skinnerian definitions of reinforcers are sometimes
called “circular” (e.g., Postman, 1947). The utterly empirical nature of these
descriptions can be considered a strength because each description auto-
matically accommodates the potentially enormous range of factors that
might reinforce different behaviors and different individuals. Got a kid
who is acting out in school? Just find whatever it is that is a consequence of
the behavior and manipulate it—that will usually work. But some people
want more than this; they want some kind of explanation of how reinforc-
ers work. And it would also be useful to know ahead of time what kinds
of events will reinforce a behavior. One way to break the circularity is to
note that reinforcers that work in one situation should also work in another
(Meehl, 1950); thus, we might be able to predict a reinforcer’s abilities by
its effects in another situation. It might also lead to a theory.
Drive reduction
Thorndike had the beginnings of a theory. He thought that satisfaction
stamped in the S-R connection. Reinforcers work because they strengthen a
connection, and they do so only if they are satisfying. But what is “satisfac-
tion”? Clark Hull, a theorist working from the Thorndike tradition, had a
related view (Hull, 1943). He thought that reinforcers reduced drives. His
theory of reinforcement was part of a much larger theory that was designed
to understand all motivation and learning. Hull thought that behavior was
organized in a way that always helped the animal satisfy its needs. The
theory can be summarized by imagining an animal who wakes up from a
good sleep in the forest. It has slept long enough that its body now needs
food—it is therefore hungry. Hunger, in turn, makes the animal restless and
active, and the heightened general activity eventually leads it to perform,
perhaps accidentally, an operant behavior that leads to food. The food
then reduces the need, and the reduction in need reinforces behavior. So,
next time that the animal needs food, the operant is performed again. In
this way, motivation, learning, and behavior were organized to meet the
organism’s needs.
In Hull’s theory, the need for food or water stimulated a theoretical
construct called Drive, which motivated behavior, increasing the strength
of general activity as well as the strength of behaviors that had been learned
(reinforced) before. It also provided the basis for reinforcement. According
to Hull (1943), reinforcement occurs when the event that is made a conse-
quence of behavior reduces Drive. Unlike Skinner’s atheoretical approach,
Hull’s theory immediately indicated a class of events or stimuli that would
reinforce—that is, things that would reduce Drive.
Hull’s general theory shaped research in the field of learning and mo-
tivation for many years. However, many parts of the theory did not hold
up to careful testing (see Bolles, 1967, 1975; see also Chapter 9), and the
Drive reduction theory of reinforcement was one of the first elements to
go. Contrary data were soon gathered by Fred Sheffield, a colleague of
Hull’s at Yale University, who showed that commodities that did not reduce
need could still reinforce. Sheffield and Roby (1950) found that saccharin, a
nonnutritive sweetener, served as a reinforcer. The body does not need sac-
charin. Sheffield, Wulff, and Barker (1951) also found that male rats would
learn to run into an alley to copulate with a female, even if the copulation
was terminated before the rat had ejaculated. Presumably, a “need” had
not been reduced; one could even argue that a new one had been aroused.
Neal Miller (1957) defended Hull’s position against these findings. Mill-
er pointed out that although the body did not really “need” saccharin, there
was something like a drive for it anyway. He showed that if a rat was fed
a little saccharin, the rat would consume less of it when saccharin was of-
fered again later; the animal also would not work as hard for it. So, feeding
a rat saccharin did reduce some kind of motivation to get the saccharin. In
this way, Miller reconceptualized Hull’s concept of Drive. But Miller had
broken the connection between Drive and biological need, and the theory
found fewer and fewer adherents after that. The loss of popularity was
partly because of other problems with the concept (see Chapter 9). It was
also because another view of reinforcement came along and carried the day.

The Premack principle


In the late 1950s and early 1960s, David Premack presented a theory that
more or less changed everything (see Premack, 1965, 1971a, for reviews).
He offered a completely new way to think about how reinforcers work in
operant learning. When most of us think about the classic experiment in
which a rat presses a lever for food, we usually see the crucial reinforcement
contingency as one between a behavior (the lever press) and a stimulus (the
food pellet). Instead, Premack noted that the real contingency is between
two behaviors. By giving the rat a food pellet after each lever press, we
are making the opportunity to eat contingent on pressing a lever. In any
operant learning situation, there is an instrumental act that provides access to a "contingent" behavior. The reason that lever pressing increases in this arrangement is that it allows access to eating, a behavior that the
rat would prefer to do.
Premack’s idea is that reinforcement occurs when the instrumental act
allows access to a more preferred behavior. The animal’s preferences can be
tested ahead of time by giving the subject free access to both behaviors and
then finding which behavior it spends the most time doing. In the lever-
press-and-eat experiment, if we had allowed the rat to choose between
lever pressing and pellet eating, we would have found the animal spending
more time eating than pressing the lever. (Experimenters usually guaran-
tee this preference by depriving the animal of food beforehand.) Eating
can therefore reinforce lever pressing because eating is preferred to lever
pressing. By pointing to the relevance of a preference test, Premack’s theory
allows one to predict what kinds of events will be reinforcing. Moreover, it
has tremendous generality. Suppose that we allowed a child to play with
several toys and timed the number of seconds that the child spent playing
with each toy. We might discover that the child spends the most seconds
playing with Toy A, the second-most seconds with Toy B, the third-most
seconds with Toy C, and so forth. Other children might rank the toys very
differently. But Premack would predict that for this particular child, access
to Toy A would reinforce playing with either Toy B or Toy C, and Toy B
might reinforce playing with Toy C (but not Toy A). All we need to know
is how the subject spends his or her time engaging in the various alterna-
tives in an initial preference test.
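The prediction step can be written out almost literally. The sketch below is my illustration (the observation times are invented): it ranks activities by the time a child spent on each during free access and then lists which activities should reinforce which, in the spirit of the preference-test logic just described.

```python
# Illustrative sketch of Premack-style predictions from a free-access
# preference test. Observation times (seconds) are invented.

observed_seconds = {"toy_A": 300, "toy_B": 120, "toy_C": 45}

# Rank activities from most to least preferred by time spent on each.
ranking = sorted(observed_seconds, key=observed_seconds.get, reverse=True)

# Any higher-ranked activity is predicted to reinforce any lower-ranked one.
for i, preferred in enumerate(ranking):
    for less_preferred in ranking[i + 1:]:
        print(f"Access to {preferred} should reinforce playing with {less_preferred}")
```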
Premack showed the power of the principle in several clever experi-
ments. In one of them (Premack, 1963b), he conducted an experiment much
like the one just described, except that the subjects were cebus monkeys
rather than children. He first observed several monkeys interacting freely
with toys that they could manipulate. In initial preference tests, when the
items were freely available, one of the monkeys, Chicko, spent most of its
time pulling on a hinged flap, then opening a door, and then pushing on
a plunger. Premack predicted (and found) that access to the hinged flap
would reinforce opening the door and pushing on the plunger. That is,
door opening and plunger pushing increased when they allowed access
to manipulating the flap. On the other hand, although opening the door
reinforced plunger pushing, it did not reinforce flap flipping. A behavior
was only reinforcing if it had been preferred to the operant behavior in the
original tests. Monkeys with different preferences showed different results.
All these findings are consistent with Premack’s principle that reinforce-
ment occurs when the instrumental act allows access to a more preferred
behavior.
In a similar experiment, Premack (1959) gave first-grade children a
choice between eating candy and playing at a pinball machine. Some of the
kids spent more time eating candy than playing pinball. Not surprisingly,
for these children, when candy was made contingent on playing pinball,
pinball playing increased. Other children, though, preferred to spend more time playing pinball than eating candy in the initial preference test. For this
latter group of kids, pinball playing reinforced candy eating—the reverse of
the effect in the other group. The Premack principle predicts what things
will reinforce other things, but it also accepts large individual differences.
All that matters is the individual’s initial preference ranking.
One of the interesting things about the pinball/candy experiment is
that it demonstrates that there is nothing special about sweets (or foods) as
reinforcers. Candy eating served as either reinforcer or operant, depending
on how the kids initially ranked it relative to pinball playing. This point
was made brilliantly in another experiment, this time using rats (Premack,
1962). Premack built a running wheel in which rats could either run or
drink water from a spout; access by the rats to either the wheel or the water
could be controlled. In one condition, rats were first deprived of water. In
a preference test with both running and drinking available, they therefore
spent more time drinking than running. Not surprisingly, but consistent
with the principle, drinking reinforced running. However, in an interest-
ing second condition, Premack gave rats all the water they needed but
deprived them of running. In the preference test, the rats’ preference was
now reversed, and they preferred running to drinking. And, consistent with
the principle, in this case running reinforced drinking. This finding was a
bit more surprising. There is nothing special about water as a reinforcer;
in fact, the relation between instrumental act and reinforcing act was said
to be reversible.
Figure 7.18  Reinforcing effect of different reinforcers (sucrose solutions and the opportunity to run in a light or heavy wheel) as a function of the rat's preference for each reinforcer (baseline probability); bar presses per session are plotted against baseline probability. (After Premack, 1965.)

One of the interesting features of Premack's approach is that it uses no theoretical constructs whatsoever. It makes testable predictions, but it is as empirical as Skinner could ever have wanted. Premack was not interested in what actually caused an organism to prefer one activity over another; in fact, he was quite indifferent to this question. All that mattered were the subject's current preferences, regardless of what the actual behaviors were or how their levels came about. Like Hull, Premack would predict that food would be more reinforcing to hungry, rather than satiated, rats, but whereas Hull would have explained that fact by noting that the hungry rats had greater Drive, Premack would merely note that the food-deprived rats would have a larger preference for eating. It is the strength of the preference, regardless of how it got there, that determines the reinforcement effect. Figure 7.18 shows the results of a rat experiment in which a variety of sucrose solutions (16%, 32%, or
64%) or the opportunity to run in a heavy wheel (HW) or a light wheel (LW)
were each made contingent on lever pressing (Premack, 1963a). Regardless
of the type of reinforcer, its reinforcing effect on lever pressing was clearly
predicted by its initial baseline probability (x-axis).
Premack went on to run some famous, although unrelated, experiments
on language learning in chimpanzees (e.g., Premack, 1971b). Before he left
the response learning question, though, he extended the principle to the
case of punishment (Premack, 1971a). Punishment, of course, is the situa-
tion in which something made contingent upon an instrumental behavior
weakens (rather than strengthens) the behavior. Premack’s idea, based on
the reinforcement rule, is that punishment will occur when the instrumental
behavior leads to a less preferred response. Thus, delivering a mild electric
shock after a lever press punishes lever pressing, not because the shock is
aversive or “annoying” (Thorndike), but because it elicits a behavior, like
fear, that is less preferred than lever pressing. The idea was tested in at least
one experiment (Weisman & Premack, as described in Premack, 1971a).
Rats were put into a running wheel with the drinking spout again. This
time, however, the wheel was motorized so that the experimenter could
flip a switch and force the rodent to run when this behavior was needed.
The rats were first deprived of water, and in a preference test, they showed
the usual preference for drinking over running. Then, running was made
contingent on drinking; when the rat drank, the motor was switched on,
and the rat was forced to run. Here, when drinking led to the less preferred
running, punishment was observed, and the rats stopped drinking. In a
control condition, rats were tested while they were not deprived of water.
In the preference test, these subjects preferred running to drinking. And
when drinks subsequently led to running initiated by turning on the motor-
ized wheel, wheel running reinforced drinking! There was nothing inher-
ently aversive about running in the motorized wheel. When running was
less preferred than drinking, it punished drinking, but when running
was preferred over drinking, it reinforced drinking.
Problems with the Premack principle
The Premack principle was an important landmark in the development of
ideas about reinforcement. It began a fundamental shift toward thinking
about relations between behaviors rather than between behaviors and
stimuli, and at the same time, it avoided Skinner’s circularity problem.
It had its points of weakness, however. One stemmed from the problem
of how to arrange preference tests. Although the principle is very clear
that reinforcement effects should be predictable from prior preferences,
it is not always easy to know how to determine those preferences. For
example, consider sex and coffee drinking. It is quite possible that dur-
ing the course of a 24-hour day you spend more time drinking coffee
than having sex. (That is at least true of many professors I know.) This
fact implies that you prefer coffee to sex and that making coffee drinking
contingent on having sex will increase your amount of sexual activity. I
am not sure that anyone has run this experiment, but you might agree
that the principle seems to have things backward: It seems more likely
that sex will reinforce coffee drinking, not the reverse. In truth, an expert
investigator would say that this more sensible prediction actually follows
from the right kind of preference test. If we limited preference testing
to, say, 30 minutes or so with both coffee and a boyfriend or girlfriend
present, one might spend more time engaged in amorous activity than in
coffee drinking. Thus, based on an appropriate preference test, we might
correctly predict that amorous activity would reinforce coffee drinking.
But what is appropriate is not always clear ahead of time, which intro-
duces an ad hoc aspect to Premack’s principle.
Another problem is that many early experiments did not acknowledge
that when you make a preferred behavior contingent on another behavior,
you necessarily deny access to a preferred behavior that would ordinarily
fill some time. For instance, when pinball playing is made contingent on
candy eating, you do not allow pinball-playing until the kid eats some
candy. So, what is the kid going to do now that he or she cannot play pin-
ball? The contingency leaves a void that must be filled by some activity,
and maybe part of the increase observed in the instrumental activity occurs
because something has to increase. Several investigations have included
control conditions that help address this issue (see Dunham, 1977, for more
discussion).
There is nonetheless an important idea here: Reinforcement contin-
gencies always deprive an organism of the chance to engage in a behav-
ior that it would ordinarily spend some time doing. Perhaps the crucial
thing is that the instrumental behavior allows access to a behavior that
has been deprived in this way. The importance of this idea became clear
in experiments by Eisenberger, Karpman, and Trattner (1967) and Allison
and Timberlake (1974). Both research groups showed that access even to
a less preferred behavior could reinforce a more preferred behavior if the
reinforcing behavior was denied below its initial baseline level. Eisenberger
et al. (1967) let high school students manipulate a knob or a lever. Most
subjects spent more time turning the knob, although they also pressed the
lever at least a little bit. The experimenters then arranged things so that
turning the knob allowed access to lever pressing, which was otherwise
completely denied. Premack would predict punishment here because a
preferred behavior is leading to a less preferred response. Instead, lever
pressing reinforced knob turning, and the probability of knob turning actu-
ally increased. Allison and Timberlake (1974) observed a similar effect in
rats drinking different concentrations of a saccharin solution. Drinking a
less preferred solution could reinforce drinking a more preferred solution
if the contingency deprived the rat of drinking the less preferred solution
below its baseline level.
The point of these results is that access even to a less preferred behavior can
be reinforcing if its baseline level has been denied or deprived. Behavior depriva-
tion is inherent in the typical situation in which a more preferred behavior
is made contingent on a behavior that is less preferred. It turns out that de-
privation of the contingent response—rather than its higher probability—is
the thing that makes reinforcement possible. The Premack principle is a
good place to start, but it is not the final answer to the reinforcement story.
Behavioral regulation theory
Timberlake and Allison (1974) summarized the new idea in a view that
is known as the response deprivation hypothesis (see also Timberlake,
1980, 1984; Timberlake & Farmer-Dougan, 1991). The idea of response de-
privation is that every behavior has a preferred level, and once access to
it is restricted, we will perform another behavior to get back to it. Thus,
access to any behavior that we perform at some level during a baseline test
will become reinforcing if we are deprived of that behavior. It is as if we are
motivated to defend and regulate a certain distribution of activities. When
a reinforcement contingency denies an activity, we will do what we can to
get it back. The general view is sometimes known as behavioral regulation
theory (see Timberlake, 1980, 1984; Timberlake & Farmer-Dougan, 1991).
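The condition can be checked with a little arithmetic. The sketch below is only one way of formalizing the idea, and its numbers and names are hypothetical: a schedule is "depriving" when performing the instrumental behavior at its baseline level would earn less of the contingent behavior than that behavior's own baseline level, so the contingent behavior should become reinforcing.

def contingency_deprives(baseline_instrumental, baseline_contingent,
                         required_instrumental, earned_contingent):
    """Response deprivation check (one common formalization): the schedule
    allows `earned_contingent` units of the contingent behavior for every
    `required_instrumental` units of the instrumental behavior. If behaving
    at the instrumental baseline would earn less contingent behavior than
    its baseline level, the contingent behavior is deprived and should
    reinforce the instrumental behavior, whatever the preference ranking."""
    earned_at_baseline = (baseline_instrumental / required_instrumental) * earned_contingent
    return earned_at_baseline < baseline_contingent

# Hypothetical baselines: 300 s of knob turning, 30 s of lever pressing.
# Schedule: every 60 s of knob turning buys only 2 s of lever pressing.
print(contingency_deprives(300, 30, 60, 2))   # True: even the less preferred
                                              # lever pressing can now reinforce
                                              # the more preferred knob turning.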
An explicit version of the theory makes some interesting further pre-
dictions (Staddon, 1979). Figure 7.19 shows two possible behaviors, A
and B, on the x- and y-axes, respectively. The black dot represents a hypothetical animal’s baseline preference between the two possible behaviors; that is, when the animal was given free access to both behaviors in a preference test, it performed behavior A 50 times and behavior B 150 times. This relationship between A and B at baseline is called a bliss point because it is the amount of A and B that are chosen when everything is free. The idea of behavioral regulation theory is that when deprived of either A or B, the animal will do its best to return to that point of bliss.

Figure 7.19  The minimum distance model. The black dot describes the baseline level or “bliss point,” that is, the level of two behaviors (A and B) when they can be made freely in a preference test. When a certain number of A responses are required for every B response, the range of possibilities is indicated by a line. We perform the number of A responses that get us closest to the bliss point (point number 1 on the line). (After Allison, 1989.)

Figure 7.20  The number of responses that get us closest to the bliss point depends on the reinforcement schedule. Each line radiating out from 0 represents a different ratio schedule in which a different number of A responses (e.g., lever presses) are required for each B response (e.g., eating of food pellets). The black dot is the bliss point; the red dots are the points on each schedule line that are closest to the bliss point. (After Allison, 1989.)

Let us consider a situation in which the animal must now perform behavior A to have access to behavior B. The line in Figure 7.19 represents one possible arrangement—a ratio schedule of reinforcement in which the animal must perform exactly one A response to earn one B response. Notice that the schedule constrains the animal’s behavior; it sets up a limited number of possible pairs of A and B behaviors and makes it impossible for the animal to return to the bliss point. What should the animal do now? According to the minimum distance model (e.g., Staddon, 1979), it will get as close as possible to the bliss point; the animal should make the number of A responses that gets it the “minimum distance” from the bliss
point. The idea is easy to understand graphically. The closest point on a line
will be the point that allows a perpendicular from the line to hit the bliss
point. As Figure 7.19 illustrates, any other possible point on the line will be
farther away from the bliss point. The minimum distance model therefore
predicts that the organism will perform the number of A responses that
will get it closest to the bliss point. In this example, the animal will make
about 100 A responses and then stop. Stopping shy of 100, or going beyond
it, will merely take the animal farther away from the bliss point.
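The geometry behind Figures 7.19 and 7.20 is easy to compute. The following sketch (an illustration, not a published implementation) projects the bliss point onto the line of possibilities defined by a ratio schedule that requires `ratio` A responses for each B response. It recovers the roughly 100 A responses described above for the one-to-one schedule, and it reproduces the rise-then-fall in responding as the ratio requirement grows.

def closest_point_on_schedule(bliss_a, bliss_b, ratio):
    """A ratio schedule confines the animal to the line B = A / ratio.
    Return the point on that line nearest the bliss point, i.e., the foot
    of the perpendicular dropped from the bliss point onto the line."""
    slope = 1.0 / ratio                          # B earned per A performed
    a = (bliss_a + slope * bliss_b) / (1 + slope ** 2)
    return a, slope * a

# Bliss point from Figure 7.19: behavior A = 50, behavior B = 150.
print(closest_point_on_schedule(50, 150, ratio=1))     # (100.0, 100.0)

# Raising the ratio requirement first increases and then decreases the
# predicted number of A responses (compare Figure 7.20).
for ratio in (0.25, 0.5, 1, 2, 4, 8, 16):
    a, b = closest_point_on_schedule(50, 150, ratio)
    print(f"ratio {ratio:>5}: about {a:5.1f} A responses, {b:5.1f} B responses earned")

A real application would also have to decide how to scale the two axes, since (for example) seconds of running and licks at a spout are not naturally comparable units; the sketch simply treats them as equivalent.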
Now, you can undoubtedly appreciate that the number of A responses
required to get the minimum distance from bliss will depend on the re-
inforcement schedule. This idea is illustrated in Figure 7.20, which plots
several more ratio schedules as new lines relating the number of A re-
sponses (in this case, lever presses) required to get a B response (eating a
food pellet). In some of the schedules (the steepest lines), the animal does
not have to perform many A responses to get plenty of B responses; the
ratio is low. In other schedules (the shallow ones), the animal must work
considerably more; that is, the ratio is high. Using the perpendicular-line
technique from above, it is easy to show that each schedule has its own
number of A responses that gets as close as possible to the bliss point; these
are shown by the points on each of the lines. Something rather interesting
emerges: As we increase the schedule ratio (going from very steep to very
shallow lines), the rat will first perform more responses and then fewer and
fewer. This kind of function is consistent with the results of a number of ex-
periments (see Staddon, 1979). Moreover, it makes interesting connections
with economic analyses (Allison, 1989). First, notice that the reinforcement
schedules that ought to produce the most lever presses—the most work, if
you want to think of it that way—are the intermediate ones. Neither leaner
nor richer schedules produce as much behavior as the ones in between—
future employers and managers should take note. Note also that, as the
price of food (or the amount of work required) increases (with shallower
and shallower lines), the overall consumption of food declines—the famil-
iar demand curve described above. Once again, we discover connections
between operant behavior and economics (e.g., Allison, 1983).
According to John Staddon (1979), experiments with ratio schedules
seem to follow the minimum-distance model. The model can also be
applied to interval reinforcement schedules, although those are differ-
ent because the schedule functions relating the number of behavior A to
B responses are not simple, straight lines. Still, the minimum-distance
model predicts a similar relationship between reinforcement rate and
behavior rate: As we increase the potential reinforcement rate relative to
the behavior rate, behavior rate should first increase and then decline.
It is interesting that the matching law (or quantitative law of effect, see
Equation 7.6) also makes predictions here; however, it does not predict
that response rate will eventually decline. In fact, as shown in Figure
7.11, a decline does not occur with interval schedules in pigeons (e.g.,
Catania & Reynolds, 1968); that is, beyond a certain reinforcement rate,
further increases in the reinforcement rate cause no further change in
behavior rate. The matching law seems more accurate here. In general,
the matching law seems to do a better job describing behavior on interval
reinforcement schedules, whereas the minimum-distance model does a
better job describing behavior on ratio schedules (Staddon, 1979). The
minimum-distance model does, however, provide some insight into what
actually makes a reinforcer reinforcing.
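By contrast, a minimal sketch of the kind of function the quantitative law of effect describes (using the familiar hyperbolic form, with arbitrary parameter values) shows why it predicts no downturn: response rate simply climbs toward its ceiling as reinforcement rate increases.

def predicted_response_rate(r, k=100.0, r_e=20.0):
    """Quantitative law of effect in its usual hyperbolic form:
    B = k * r / (r + r_e), where r is the reinforcement rate earned by the
    behavior and r_e is the reinforcement rate for all other behaviors.
    k and r_e are free parameters; these values are arbitrary."""
    return k * r / (r + r_e)

for r in (5, 20, 80, 320, 1280):                  # reinforcers per hour (illustrative)
    print(r, round(predicted_response_rate(r), 1))   # climbs toward k and levels off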
Selection by consequences
It would be a mistake to end this chapter without mentioning one last per-
spective on how reinforcers might operate. Several writers have noted the
parallel between the effects of reinforcers and evolution by natural selection
(e.g., Donahoe, 1998; Donahoe & Palmer, 1994; Skinner, 1981; Staddon &
Simmelhag, 1971). According to this idea, reinforcers select behaviors by
essentially weeding out the behaviors that are less useful. This idea is a bit
different from the traditional Thorndikian view that reinforcers “stamp in”
the behaviors that lead to them.
This idea was actually introduced in Chapter 2 during the discussion of
shaping. Remember that natural selection operates through elimination. In
Chapter 2, we saw how the color of peppered moths in England got darker
and darker over generations during the Industrial Revolution. At the start
of the Industrial Revolution, some moths were dark in color, many were
light in color, and others were probably somewhere in between. Over the
years, as the trees darkened because of increasing factory smoke, birds
began to detect light-colored moths on the trees more readily than before,
ate them, and thereby eliminated the light-colored moths (and their poten-
tial offspring) from the moth population. The moth population therefore
gradually shifted from light-colored moths (less visible on light trees) to
dark-colored moths (less visible on darkened trees). There are two impor-
tant things to note here. First, there was some variation in the shading of
the moths. Second, natural selection then acted on that variation by elimi-
nating the less viable moths.
The idea that reinforcement works the same way was nicely described
by Staddon and Simmelhag (1971). As in evolution, the organism starts
with a variable set of behaviors from which reinforcers can then select. The
initial behaviors are brought about by “principles of variation” that are
analogous to those provided by genetics and heredity in evolution. That
is, the animal initially does certain things in a new environment because
of generalization from previous environments, innate tendencies elicited by
the environment, and so on. Reinforcers then select from these variations, just
the way natural selection does. The behaviors that do not lead to reinforce-
ment drop out—they extinguish, or (to use the word from evolution rather
than learning theory) they become extinct. Reinforcers basically keep some
behaviors from being eliminated this way. The noneliminated behaviors
remain in the population, and new variations are then produced through
processes like induction and generalization (see Chapter 2). Reinforcers
then select again, and so on. The idea is that an understanding of how re-
inforcers work will require an understanding of (1) principles of behavior
variation and (2) selection by elimination just like evolution.
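A toy simulation can convey the flavor of this two-part recipe. Everything in the sketch below is invented for illustration (the "responses" are just numbers, and the reinforcement rule is arbitrary); the point is only that eliminating unreinforced variants and regenerating variation around the survivors gradually concentrates behavior where the payoff is.

import random

random.seed(1)

def reinforced(response, target=70.0, tolerance=5.0):
    """Arbitrary payoff rule for the illustration: only responses whose
    value (say, a force or duration) falls near 70 units are reinforced."""
    return abs(response - target) <= tolerance

# Principles of variation: start with a widely scattered set of responses.
population = [random.uniform(0, 100) for _ in range(20)]

for cycle in range(25):
    # Selection by elimination: unreinforced variants drop out (extinguish).
    survivors = [r for r in population if reinforced(r)]
    if not survivors:                      # simplification: keep the nearest miss
        survivors = [min(population, key=lambda r: abs(r - 70.0))]
    # New variation (induction, generalization) is generated around the
    # surviving responses, and the cycle repeats.
    population = [random.gauss(random.choice(survivors), 3.0) for _ in range(20)]

print(round(sum(population) / len(population), 1))   # hovers near the reinforced zone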
Toward the end of his long career, Skinner himself emphasized the
parallel between evolution and operant learning (Skinner, 1981). Both pro-
cesses involve a principle he called “selection by consequences.” He noted
that the selection-by-consequences idea is relatively new in science, hav-
ing been discovered around the time of Charles Darwin in the middle of
the nineteenth century. This concept was different from earlier scientific
principles, which usually required mechanical forces, energies, and initi-
ating causes to make things work. In contrast, selection by consequences
just happens; no causal agent (like a creator in the case of evolution or a
creative mind in the case of operant learning) is required.
Despite the simplicity of variation and selection, great complexity can
emerge with their repeated application. This point is nicely illustrated in
Richard Dawkins’s book, The Blind Watchmaker (1986). The intricacy and
complexity of living things suggests to many observers that they must
have been designed by a great, creative mind—that is, a skilled and pow-
erful watchmaker. But in fact, no designer is required—the “watchmaker”
might be blind. Great complexity can arise over the millions of generations
in evolutionary time—or the millions of operant conditioning trials that
presumably occur in an organism’s lifetime. Dawkins, whose focus was
evolution and not operant conditioning, created a computer program that
generated simple “biomorphs” by variations in several “genes” that were
built into the program. The program could start by generating a simple
Figure 7.21  Repeated cycles of variation and selection can produce impressive
beauty and complexity. Dawkins (1986) used the changes in this “biomorph”
over variation/selection cycles to illustrate the process of evolution. The same
case can be made for operant conditioning in that complex behaviors might
emerge from the extremely large number of variation and selection cycles that
can occur in an organism’s lifetime. (From Dawkins, 1986.)
item (Figure 7.21), and then create a set of minor, random mutations of
it. Dawkins himself would then select one of the mutations, and discard
(eliminate) the others. The selected mutation would then be allowed to re-
produce. That is, the computer created new mutations of it. Then Dawkins
selected again from this new set of variations. After a number of such
variation-and-elimination cycles, some beautiful and complex “organisms”
began to emerge. An example is shown in Figure 7.21. Repeated cycles
of a simple variation/selection process can produce end products of im-
pressive beauty and complexity (Figure 7.22). The same could be true of
operant conditioning. Human behavior, which seems so miraculous and so
complex, might similarly emerge from the very large number of variation/
selection cycles that occur in a typical lifetime.
What impact has the variation/selection idea had on Learning Theory?
Although the parallel between reinforcement and selection is fascinating, it
has not stimulated much new research. There has been progress in under-
standing some of the principles of variation (see Balsam, Deich, Ohyama,
& Stokes, 1998; Neuringer, 1993, 2004), and investigators now recognize
that reinforcers function to select—as well as to strengthen—operant be-
havior (e.g., Williams, 1988). But there has been little research pursuing the
idea that reinforcers work mainly by weeding out instead of stamping in.
Advocates of the selection-by-consequences view (e.g., Donahoe, 1998; Do-
nahoe & Burgos, 2000; Donahoe, Burgos, & Palmer, 1993; McDowell, 2004)
sometimes tacitly endorse a stamping-in type of reinforcement mechanism.

Figure 7.22  Different “biomorphs” that “evolved” in Dawkins’s (1986) computer software program. The individual biomorphs are labeled Swallowtail, Man in hat, Lunar lander, Precision balance, Caddis, Scorpion, Cat’s cradle, Tree frog, Spitfire, Crossed sabres, Bee-flower, Shelled cephalopod, Insect, Fox, Lamp, Jumping spider, and Bat. (From Dawkins, 1986.)
Although the selection-by-consequences view provides an interesting per-
spective on response learning, it is unusual among the perspec-
tives reviewed in this chapter in that it has yet to generate a new line of
research that has led to new insights.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources and online quizzes.

Summary
1. Early thinkers had different ideas about what was going on in instru-
mental learning. Thorndike emphasized reinforcement: Satisfaction was
supposed to stamp in an S-R association. Guthrie claimed that rein-
forcement was not necessary for learning; S and R were associated if
they merely occurred together in time. Tolman argued that learning was
mostly about stimuli (S-S) and that reinforcers were important for moti-
vating performance even though they were not necessary for learning.
The early theorists identified at least three possible functions of rein-
forcers: They might stamp in behavior (Thorndike), they might provide
another stimulus (Guthrie), and they might motivate (Tolman).
2. Skinner’s “atheoretical” approach emphasized the strengthening effects
of both primary and conditioned reinforcers and also emphasized stimu-
lus control, the concept that operant behaviors occur in the presence
of stimuli that set the occasion for them. Skinner invented the operant
experiment, which is a method that examines how “voluntary” behavior
relates to its payoff—the animal is free to repeat the operant response
as often as it chooses.
3. Schedules of reinforcement provide a way to study how behavior relates
to payoff. Ratio schedules require a certain number of responses for
each reinforcer, and there is a direct relationship between behavior rate
and payoff rate. Interval schedules reinforce the first response after a
specified interval of time has elapsed. In this case, there is a less direct
relationship between behavior rate and payoff rate. Different schedules
generate their own patterns of responding.
4. Choice is studied in concurrent schedules, where two behaviors are
available and are reinforced according to their own schedules of re-
inforcement. Choice conforms to the matching law, which states that
the percentage of behavior allocated to one alternative will match
the percentage of reinforcers earned there. Even the simplest oper-
ant experiment involves this kind of choice because the organism must
choose between lever pressing and all other available behaviors, which
are reinforced according to their own schedules. According to the quan-
titative law of effect—an extension of the matching law—the rate of a
behavior always depends both on its own reinforcement rate and on the
reinforcement rate of other behaviors.
5. We often have to choose between behaviors that produce large,
delayed rewards versus behaviors that yield smaller, but more immedi-
ate rewards. We are said to exercise self-control when we choose the
large, delayed reward, but we are seen as impulsive when we go for
the smaller, immediate reward. Choice here is a lawful function of how
a reinforcer’s value depends on both its size and its imminence in time.
Self-control can be encouraged by committing to the delayed reward
earlier (precommitment) and by several other strategies.
6. Reinforcers are not all the same. According to economic principles,
reinforcers may substitute for one another (e.g., Coke for Pepsi, and
vice versa), they may be independent of one another (Pepsi and books),
or they may complement one another (chips and salsa). A complete
understanding of choice will need to take into account the relationship
between the different reinforcers.
7. The Premack reinforcement principle states that access to one behav-
ior will reinforce another behavior if the first behavior is preferred in a
baseline preference test. The principle has very wide applicability, and
it rescues the Skinnerian definition of a reinforcer (“any consequence of
a behavior that increases the probability of that behavior”) from its cir-
cularity. Premack’s punishment principle states that access to a behavior
will punish another behavior if the first behavior is less preferred.
8. According to behavior regulation theory, animals have a preferred level
of every behavior that they engage in. When a behavior is blocked or
prevented so that it is deprived below its preferred baseline level, ac-
cess to it becomes reinforcing. Behavior regulation theory replaces the
Premack principle as a way of identifying potential reinforcers.
9. Reinforcers may operate like natural selection. According to this idea,
they may select certain behaviors largely by preventing them from elimi-
nation (extinction). As is true in evolution, great subtlety and complexity
can emerge over time from repeated application of principles of varia-
tion and selection.

Discussion Questions
1. The early theorists (Thorndike, Guthrie, and Tolman) had very different
ways of thinking about instrumental learning. Summarize their views
with an emphasis on (a) how they conceptualized the effects of reinforc-
ers and (b) what they thought was actually learned. How do their views
contrast with those of B. F. Skinner?
2. Provide an example of an operant behavior that you recently observed
in your own behavior or that of a friend over the last few days. Then
use it to illustrate the concepts of discriminative stimulus, conditioned
reinforcer, schedules of reinforcement, and behavior chain.
3. What does the quantitative law of effect say influences the strength of
an operant behavior? Using the law, recommend ways to (a) decrease a
child’s overconsumption of candy and (b) increase the amount of time
that a child pays attention to his or her lessons in class.
4. Explain how the Ainslie-Rachlin rule predicts that a person’s choice of
a small and more immediate reward (over a bigger but more delayed re-
ward) can be influenced by the time when the choice is made. How can
individuals be encouraged to choose the delayed reward?
5. Modern views of operant behavior generally accept the idea that choice
is everywhere—organisms are always engaged in choosing between
different behaviors and their associated reinforcers. Choice was actually
discussed at several different points of the chapter (e.g., the quantita-
tive law of effect, delay discounting, behavioral economics). After think-
ing about these perspectives, describe what you believe are the key
factors that influence choice.
6. On her days off, a student likes to read, cook, play tennis, watch movies,
play cello, and talk to her family and friends. (These activities are not
listed in any particular order.) How could you use the Premack principle
to encourage her to spend more time practicing cello? How would you
conceptualize that from the perspective of behavioral regulation theory?
7. Describe the idea that reinforcement works through selection by conse-
quences. Discuss the parallels with evolution. How does this perspec-
tive enrich or enhance the perspective on instrumental learning that is
provided by other theories of reinforcement?
Key Terms
behavior chain  255
behavioral economics  273
behavioral regulation theory  282
bliss point  282
chained schedule  260
complements  274
compound schedule  260
concurrent schedule  260
conditioned reinforcer  255
contiguity theory  247
contingency management  271
continuous reinforcement (CRF) schedule  257
cumulative record  258
cumulative recorder  258
delay discounting  267
demand curve  273
discriminative stimulus (SD)  254
Drive  277
fading  254
fixed interval (FI) schedule  259
fixed ratio (FR) schedule  258
incentive-based treatment  271
independents  274
interval schedule  259
latent learning experiment  252
learning/performance distinction  251
matching law  261
melioration  263
minimum distance model  282
multiple schedule  260
precommitment strategy  269
Premack principle  279
primary reinforcer  255
quantitative law of effect  264
ratio schedule  258
reinforcement theory  246
response deprivation hypothesis  282
SΔ  254
schedules of reinforcement  257
SD  254
secondary reinforcer  255
stimulus control  253
stimulus elements  247
stimulus sampling theory  248
substitutability  272
substitutes  274
superstitious behaviors  253
variable interval (VI) schedule  259
variable ratio (VR) schedule  258
Chapter Outline
Categorization and Discrimination  295
  Trees, water, and Margaret  296
  Other categories  298
  How do they do it?  301
Basic Processes of Generalization and Discrimination  305
  The generalization gradient  306
  Interactions between gradients  309
  Perceptual learning  313
  Mediated generalization and acquired equivalence  317
  Conclusion  320
Another Look at the Information Processing System  320
  Visual perception in pigeons  321
  Attention  325
  Working memory  326
  Reference memory  332
The Cognition of Time  335
  Time of day cues  335
  Interval timing  336
  How do they do it?  340
The Cognition of Space  343
  Cues that guide spatial behavior  343
  Spatial learning in the radial maze and water maze  346
  How do they do it?  349
Metacognition  355
  How do they do it?  358
Summary  359
Discussion Questions  360
Key Terms  361
Chapter 8
How Stimuli Guide Instrumental Action

Chapter 7 mentioned that a behavior almost always
has a bigger payoff in some situations than in others.
There is thus a clear value in being able to learn to
discriminate between different situations. That is what
the mechanisms of stimulus control are all about: They
allow us to learn to discriminate situations in which
behaviors are rewarded from situations in which they
are not. The classic demonstration of stimulus control
was Skinner’s rat pressing the lever in the presence
of a light (when the response was reinforced) and not
pressing the lever in the absence of the light (when
the response was not reinforced). The presence of the
light, a discriminative stimulus (SD), allowed the rat to
behave very efficiently.
It should not surprise you to learn that modern re-
search on stimulus control uses stimuli that are a bit
more interesting than light on and light off. For ex-
ample, Bond and Kamil (1998, 2002, 2006) studied the
behavior of blue jays in a Skinner box equipped with
a computer screen on one wall (Figure 8.1A). They
used the screen to project digital images of artificial
moths resting on a granular-looking background (Fig-
ure 8.1B). When the display contained a moth, pecks
at it were reinforced with a piece of mealworm, but if
a moth was not displayed, pecks were not reinforced.
(The bird had to peck the circle in the center to go
on to the next trial.) The birds learned to peck ac-
cordingly, which should not surprise you given what
you know about stimulus control and that blue jays
Figure 8.1  Digital “moths” were shown to blue jays in experiments by Bond and Kamil (2002).
(A) Experimental setup. On any trial, a single moth or no moth at all was embedded in one of the
two granular backgrounds. (B) Top: A sample of moths from the population presented at the start
of the experiment (shown on a solid background to make them visible to the reader as well as on
the granular background actually used). Bottom: A sample of moths from the end of the experi-
ment. They are virtual offspring of moths that had evolved over many sessions in which the blue
jays could detect and eliminate moths from the population. Over sessions, moths from the parent
population became more cryptic and more variable in appearance. (A, courtesy of Alan Bond;
B, from Bond & Kamil, 2002.)

probably eat many moths sitting on trees in the wild. To state the obvi-
ous: The contingencies of reinforcement allowed the image of a moth to
become an SD for approaching and pecking. The psychological processes
behind stimulus control helped the blue jays forage efficiently for food.
Bond and Kamil (1998, 2002, 2006) actually demonstrated that stimulus
control processes can affect both a predator and its prey. When some of the
moths were more conspicuous than others, the blue jays tended to detect
and peck at the conspicuous ones and miss the more hidden (or “cryptic”)
moths. If the birds had actually been eating the moths, the conspicuous
moths would not have survived. In evolutionary terms, there is thus a
payoff for moths to be cryptic: Cryptic moths that escape detection are
more likely to have offspring in the next generation. Bond and Kamil rec-
ognized this, and their experiments were actually designed to study a kind
of virtual evolution. Moths that the blue jays detected and pecked in one
session were eliminated from the population that was presented in the next
session. In contrast, moths that were not pecked (i.e., escaped detection)
survived and were represented in the next session by “offspring” that dif-
fered only a little from the parents. The blue jays’ behavior thus introduced
a bit of natural selection, and consequently, the moth population changed
in interesting ways (see Figure 8.1B, bottom). For one thing, the moths did
evolve to become a bit more cryptic. Perhaps more noticeably, the moths
gradually came to look different from one another. (Differences between
individuals of a species, or “polymorphisms,” are actually very common
in real moths.) The differences resulted from the blue jays’ ability to learn
about a particular moth over several trials and then search for a similar one
on the next trial (a phenomenon called “search image” that will be consid-
ered later in this chapter). Therefore, moths that were different from their
neighbors were more likely to survive than moths that were not different.
The psychology of stimulus control thus affected both the behavior of a
predator and the evolution of its prey.
It is hard to overstate the importance of stimulus control. My goal in
this chapter is therefore to reach a better understanding of it. We will begin
by looking at how animals discriminate and respond to complex sets of
stimuli in a way that suggests that they can learn categories or “concepts.”
The discussion will lead us to consider a number of basic processes behind
the psychology of stimulus control. We will then see how the familiar in-
formation processing system (the so-called standard model of cognition)
contributes to it all. In the final parts of this chapter, we will ask how or-
ganisms use time, spatial cues, and perhaps even knowledge of their own
cognitive states to optimize their interactions with the world. Throughout,
we will see that it has become common to invoke cognitive processes (psy-
chological processes that are not directly observed in behavior) to explain
behavior under stimulus control. At the same time, researchers have also
recognized that methods for studying stimulus control provide a power-
ful way to study those processes. Modern research on stimulus control is
thus part of a research area that is sometimes known as animal cognition,
or the study of cognitive processes in animals (e.g., Pearce, 2008; Roberts,
1998; Roitblat, 1987; Zentall & Wasserman, 2012). We have learned a lot
about cognition and behavior from experiments on the control of operant
behavior by complex discriminative stimuli (e.g., Fetterman, 1996).

Categorization and Discrimination


Let us return to the pigeon, the favorite participant in experiments on op-
erant behavior since Skinner first used them in the 1940s and 1950s. One
reason the pigeon is a popular subject is that its eyesight is very good.
Pigeons have color vision, for example, and they can discriminate between
complex and colorful stimuli. Nowadays a typical pigeon box (like the
blue jay box, just described) is outfitted with a computer screen, which
allows enormous flexibility in the type of image that can be presented as
a potential SD.
Trees, water, and Margaret
Herrnstein, Loveland, and Cable (1976) had a pigeon box with a screen
that allowed them to show the birds a long series of color slides. The birds
viewed each slide for about 30 seconds. In the presence of half the slides,
pecking the key was reinforced. In the presence of the other slides, pecking
was not reinforced. The logic of the experiment is again like Skinner’s rats.
In this case, a bird should have learned to peck only when a positive slide
(the SD) was showing on the screen.
The birds did exactly that, but what made the experiment interest-
ing was the pictures that actually made up the slide show. Each bird was
trained and tested with about 1,700 slides (Figure 8.2). For some birds, the
slides that signaled that pecking would be reinforced contained an image
of a tree, and slides that signaled no food did not contain a tree. The slides
were taken during all four seasons in New England. Thus, a “positive” tree
picture was sometimes a shot of a tree in the summer, sometimes in the
winter, or sometimes in the fall. The tree could be the focus of a scene, or
it could be an extra item in the background. Many slides contained only
parts of trees, and the trees were photographed from many angles and
vantage points. Even though the tree images were so variable, the birds
learned to discriminate tree slides from other slides (pecking more during
tree slides than “non-tree” slides) over many sessions of training. Just as
important, when the birds were tested with new slides that they had never
seen before, they also responded mainly if the new slide contained a tree.
The pigeons had not just memorized particular pictures—they generalized
to new examples, too. The birds had learned to categorize the pictures as
those that contained trees and those that did not.
Other pigeons in the Herrnstein et al. (1976) experiment were trained
and tested with other categories. Some pigeons learned to peck in the
presence of pictures that contained images of water. These birds also saw
hundreds of slides, some of which contained images of water (indoors as
well as outdoors, during any season) and some of which did not. Only
pictures with water signaled that pecking would be reinforced. The birds
learned to categorize accurately, and they also responded accurately to new
pictures. A final group of birds learned about a specific human female. Here
again there were hundreds of pictures, half of which contained Margaret
(inside, outside, dressed for any season, and with or without other people)
and half of which did not. The birds learned to peck only in the presence
of slides containing Margaret and not slides containing other humans or
no human. Once again, the pigeons accurately responded to new pictures
that they had never seen before.
The pigeons’ feat is impressive because whatever defined the tree,
water, and Margaret categories was fuzzy and hard to describe. They were
Figure 8.2  Examples of negative and positive test slides from the tree, water, and Margaret categories in the study by Herrnstein et al. (1976). Pigeons correctly responded in the presence of the positive slides and did not respond in the presence of the negative slides. (From Herrnstein et al., 1976.)
“polymorphous” categories in the sense that no single feature was neces-
sary or sufficient to identify them. Trees can be green, orange, brown, or
full of flowers; they can have trunks that are brown, gray, white, and so
forth. New trees will have some, but not all, of these different features.
Many non-tree slides, such as a stalk of celery, will have some of the same
features, too. The categories are complex and open-ended, which makes
them very different from “light on versus light off.” When you think about
it, however, polymorphous stimulus sets are probably very common in the
natural world because the stimuli that we respond to are inherently variable
from trial to trial or over time. Many animals presumably learn to identify
many stimuli in their natural environments from different vantage points.
It should not surprise us that learning and perception processes can make
sense of variable input.
Other categories
Pigeons have been taught to categorize an astounding range of stimuli in
experiments using similar methods. In the first paper ever published on
categorization in pigeons (to my knowledge), Herrnstein and Loveland
(1964) showed that pigeons could discriminate pictures containing people
from pictures that did not. Later experiments (Herrnstein & de Villiers,
1980) used underwater pictures shot on coral reefs that either contained
images of fish or did not. The birds had no trouble learning to discriminate
between these types of slides, which seemed unnatural in the sense that no
living pigeon—or ancestor of one—had ever needed to learn about tropical
fish! Other experiments have shown that pigeons can discriminate between
animals versus non-animals and kingfishers versus other birds (Roberts
& Mazmanian, 1988), letters of the alphabet (e.g., Lea & Ryan, 1990), and
moving video clips of digitally created images of animals shown running
versus walking (Asen & Cook, 2012). You may remember another classic
experiment that was described at the beginning of Chapter 1 in which pi-
geons learned to discriminate between color images of paintings by Monet
and Picasso (Watanabe et al., 1995). In that case, Monet-trained birds gen-
eralized to new images by Cezanne and Renoir, whereas Picasso-trained
birds generalized to Braque and Matisse.
Ed Wasserman and his collaborators at the University of Iowa have run
many experiments in which pigeons learn to discriminate between four
categories at the same time (e.g., Bhatt, Wasserman, Reynolds, & Knauss,
1988; Wasserman & Bhatt, 1992; Wasserman, Brooks, & Murray, 2015; Was-
serman, Kiedinger, & Bhatt, 1988). In a typical experiment, pigeons are
shown images on a small computer screen and then are required to peck
one of four keys located near the corners of the screen (Figure 8.3). The
picture might contain one of several examples of cars, chairs, flowers, or
cats (or sometimes people, as in Figure 8.3). To receive a reinforcer, the
pigeon must peck one of the keys, depending on whether the image on
the screen is a cat, car, flower, or chair. The birds thus essentially report the
“name” of the image by pecking a particular key. The birds learn this task
Figure 8.3  (A) Apparatus and (B) examples of positive images from the cars, chairs, flowers, and people categories in categorization experiments by Bhatt et al. (1988). Pecking one key was reinforced in the presence of an image from each of the categories. (A, from Wasserman, 1995; B, from http://www.pigeon.psy.tufts.edu/avc/urcuioli/Bhatt_frame.htm.)

rapidly and accurately (Figure 8.4A). They also seem to do equally well
with categories of human-made things (e.g., cars and chairs) or natural
things (e.g., cats and flowers). It is not that pigeons cannot tell the differ-
ence between different category members: If they are required to peck one
key for some cats and another key for others, the birds can learn that, too.
Thus, the pictures that make up each category appear to be similar but still
discriminably different to the birds.
When pigeons are tested with new images they have never seen be-
fore—new exemplars of the cat, car, flower, and chair categories—they
respond well above chance, although performance is a bit worse than with
Figure 8.4  Results of an experiment in which pigeons learned four categories (cars, chairs, flowers, and cats). (A) Acquisition (percent correct choice over 5-session blocks). (B) “Transfer” tests in which the birds were shown new slides (as well as old slides they had previously been trained with). Although accuracy declined a little with the new slides, it was still well above chance (25%). (After Bhatt et al., 1988.)

images the birds have specifically been trained with (see Figure 8.4B).
Transfer to new stimuli can be impressive. In one of my favorite experi-
ments of this series, the birds were trained with a very large set of pictures
that were never repeated. From the beginning, every time a picture of, say,
a car appeared, it was a picture that the birds had never seen before. (To
be properly cautious, the experimenters used pictures of people instead
of cats on the grounds that cats, but not people, tend to look too much
alike—although they were being a bit anthropocentric in hindsight.) Even
when exemplars were never repeated, the pigeons rapidly learned to dis-
criminate between cars, chairs, flowers, and people.
In another favorite experiment, Wasserman et al. (2015) trained pigeons
to discriminate between color pictures from 16 different categories (baby,
bottle, cake, car, cracker, dog, duck, fish, flower, hat, key, pen, phone, plane,
shoe, and tree). Such a large number of categories required a slightly differ-
ent method. In each training session, the birds were reinforced for pecking
a distinct image that was associated with each category. On each trial, a
color picture was presented at the center of the screen, and the birds were
reinforced for pecking at an image that “labeled” the category instead of a
randomly chosen image that labeled another one. Using this method, the
birds learned to label pictures that were examples of each of the 16 catego-
ries and later showed good performance when tested with new pictures
that they had never seen before.
The ability to respond correctly to new stimuli is a crucial part of
studies of categorization. One reason it is important to test the animals
Figure 8.5  Categorization (percent correct choice) after training with categories of different sizes (1, 4, or 12 different pictures in each category). Although larger categories were more difficult to learn than smaller categories (see responding to “old” slides), they led to more accurate categorization when tested with new slides. (After Wasserman & Bhatt, 1992; data from Bhatt, 1988.)

with novel images (called transfer tests) after training is that pigeons
have excellent memories for specific photographs. Based on experiments
by Vaughan and Green (1984), investigators have long suspected that
pigeons can remember 320 different photographs; a later experiment sug-
gests that they have a capacity to remember over 800 (Cook, Levison, Gil-
lett, & Blaisdell, 2005). Therefore, if experimenters do not test transfer to
new images, it is possible that the birds have merely learned to respond
to each specific training slide. It is interesting to note that to make the
categorization of new stimuli more accurate, it is best if the bird has
experience with many different examples during training. For example,
during training, Bhatt (1988; described in Wasserman & Bhatt, 1992) gave
different groups of pigeons either one example of each category (a single
cat, car, flower, or chair set the occasion for pecks to the different keys),
four examples of each category, or twelve examples of each category.
Not surprisingly, the task involving only one example of each category
was the easiest for the pigeons to learn. As Figure 8.5 shows, however,
transfer to new pictures, after training with only one example of each
type, was rather bad (25% is no better than chance because one-fourth of
the guesses would have been correct). The more examples used in train-
ing, the better was the transfer to new images (see Figure 8.5). This result
illustrates an important point: If your goal is to increase performance in
new situations, the more examples in training the better, even though the
learning may be a little difficult.
How do they do it?
The experiments just described provide nice demonstrations of catego-
rization in pigeons, but they do not tell us much about how the pigeons
accomplish this feat. You might be surprised to know that theories we have
considered in previous chapters do extremely well here.
For example, one approach to categorization is called feature theory.
This approach accepts the idea that all pictures contain many different
features and that the bird must essentially learn which features signal
reinforcement and which features do not. This is a kind of analysis that
we saw as early as Chapter 1, and the Rescorla-Wagner model (as well
as other theories described in Chapters 4, 5, and 6) readily applies. For
example, consider a pigeon learning to discriminate images of cats from
non-cats. One picture of a cat might contain several features, such as
whiskers, ears, white paws, and a tail. The picture is therefore a com-
pound stimulus, made up of a large number of elements that might not
be that different from simpler compound stimuli that are made up of
tones and lights. When the compound is paired with a reinforcer, each
feature might be associated with it a little; the associative strength of each
increasing a bit with each trial. On a second trial, the slide might contain
such features as whiskers, a tabby face, and gray paws. The associative
strength of these cues would likewise increase. On the next trial— a nega-
tive trial—the slide might show a dog rather than a cat. Here, we might
have another set of features, say, whiskers, floppy ears, brown color,
and a tail. Each of these cues would receive a decrement in associative
strength and possibly some inhibition. In the long run, after a number of
such trials, several features would have associative strength. Some might
be negative (inhibitory), some might be positive (excitatory), and many
redundant or irrelevant features might have little associative strength.
If a brand-new cat or dog image were then presented, the system would
detect its features, add their associative strengths, and then respond as
a function of the sum of their strengths. In this way, a feature theory
like the Rescorla-Wagner model could explain the bird’s categorization
performance. Given what you already know about the Rescorla-Wagner
model and the related connectionist theories described in Chapters 1 and
6 (e.g., McClelland & Rumelhart, 1985), you can see how it can all come
together. Theories of conditioning are relevant here.
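A minimal sketch shows how little machinery this requires. The features, parameter values, and slides below are invented for illustration; the update rule is the familiar error-correction rule from the Rescorla-Wagner model, applied to whatever features happen to be present on a slide.

# Feature-theory sketch of a cat/non-cat discrimination using a
# Rescorla-Wagner-style update (hypothetical features and slides).
alpha, lam = 0.2, 1.0           # learning-rate parameter and asymptote
V = {}                          # associative strength of each feature

def run_trial(features, reinforced):
    """All features present on the slide share the same prediction error."""
    prediction = sum(V.get(f, 0.0) for f in features)
    error = (lam if reinforced else 0.0) - prediction
    for f in features:
        V[f] = V.get(f, 0.0) + alpha * error

slide_show = [
    ({"whiskers", "pointed_ears", "white_paws", "tail"}, True),   # cat: peck reinforced
    ({"whiskers", "tabby_face", "gray_paws"}, True),              # cat: peck reinforced
    ({"whiskers", "floppy_ears", "brown_fur", "tail"}, False),    # dog: not reinforced
]

for _ in range(50):                              # many passes through the slides
    for features, reinforced in slide_show:
        run_trial(features, reinforced)

# A brand-new "cat" slide is evaluated by summing its features' strengths.
new_slide = {"whiskers", "pointed_ears", "tail"}
print(round(sum(V.get(f, 0.0) for f in new_slide), 2))   # comfortably above zero

In this toy run, features that appear only on the nonreinforced slides end up with negative (inhibitory) strength, which is just the division of labor described in the paragraph above.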
To test this kind of process, it has been necessary to present stimuli
with features that the experimenter can manipulate easily. Photographs
of cats, cars, trees, and Margaret contain features that are just too dif-
ficult to analyze. So Castro and Wasserman (2014), for example, studied
pigeons that learned about artificial categories. They showed pigeons a
series of displays on a touchscreen; the displays varied from trial to trial
(Figure 8.6). Each display had a distinctive feature at each corner, but
the features were presented in different corners from trial to trial. Two
features always indicated that the display was a member of Category A,
and two other features indicated that it was from Category B. (Can you
identify the predictive features in Figure 8.6?) The remaining features
varied from trial to trial and were not relevant for predicting the category
(they occurred equally often on Category A and Category B displays).
Figure 8.6  Examples of displays used in categorization experiments by Castro and Wasserman (2014). Two of the features in the corners always identified a display as a member of Category A, and two identified it as a member of Category B. Can you find the predictive features? (After Castro & Wasserman, 2014.)
On each trial, a display was presented, and the pigeon had to peck the
display (anywhere) a few times. Then two response buttons appeared
on the touchscreen; pecks to Button A were reinforced if the display was
from Category A, and pecks to Button B were reinforced if it was from
Category B. The birds learned to categorize the displays very well (by
pecking at A or B). The interesting result, however, was that they also
directed their early pecks at the relevant features in the corners, especially
on trials when their categorization choices (A or B) were correct. That is
exactly what you would predict if the birds were directing their attention
to the predictive features (e.g., Mackintosh, 1975a; Pearce & Mackintosh,
2010) or if those features were acquiring lots of associative strength (e.g.,
Rescorla & Wagner, 1972; Wagner, 2003).
Huber and Lenz (1993) showed pigeons images of faces like the ones
shown in Figure 8.7. The faces actually varied in four ways: the size of the
forehead, the space between the eyes, the length of the nose, and the size
of the chin below the mouth. As the pictures in Figure 8.7 illustrate, each of
these dimensions could have one of three different values (i.e., there were
three forehead sizes, three widths of gaps between the eyes, three lengths
of noses, and three chin sizes). Moving from left to right in Figure 8.7, each
feature shown was given an arbitrary value of –1, 0, and +1. On many trials,
Huber and Lenz then showed the birds pictures of a number of faces made
up of different combinations of these features. Pecking was reinforced if
the face contained features with values that summed to more than zero.
(Of the three faces shown, only Figure 8.7C would have been reinforced.)
The birds learned the discrimination. More importantly, their rate of peck-
ing was a simple function of the sum of the features’ values. The pigeons
appeared to learn about the relevant features and then respond according
to their value. The feature-learning perspective is consistent with many
findings in the categorization literature (for other examples, see Jitsumori
& Yoshihara, 1997; Loidolt, Aust, Meran, & Huber, 2003).
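The decision rule in that experiment is simple enough to state in a line or two. In this sketch the particular feature values are made up, but the rule itself (respond when the four values sum to more than zero) is the one just described.

def peck_reinforced(face):
    """Each face is coded as four feature values (forehead, eye separation,
    nose, chin), each -1, 0, or +1. Pecking pays off only when the sum of
    the values is greater than zero."""
    return sum(face) > 0

print(peck_reinforced((-1, 0, +1, +1)))   # True:  sum = +1, reinforced
print(peck_reinforced((+1, -1, 0, 0)))    # False: sum =  0, not reinforced
print(peck_reinforced((0, -1, -1, +1)))   # False: sum = -1, not reinforced

The finding that pecking rate tracked the summed value, and not just the reinforced versus nonreinforced split, is what makes the result look like feature learning rather than memorization of whole faces.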
With some sets of stimuli, however, pigeons appear to respond as
if they have learned a sort of prototype of the stimuli in the category
(Huber & Lenz, 1996; Jitsumori, 1996; von Fersen & Lea, 1990). In this case,

Figure 8.7  Faces used in the categorization experiment by Huber and Lenz (1993). (After Huber & Lenz, 1993.)
the birds responded to stimuli to the extent that the stimuli were similar
to a kind of average cat or chair rather than the sums of features. Accord-
ing to prototype theory, exposure to the different types of trials results
in the formation of just such a prototype (e.g., Posner & Keele, 1968). To
put it simply, pigeons in the early Wasserman experiments might have
formed a representation of the average cat, average car, average chair,
and average flower and then responded to new examples depending on
how similar they were to one of the prototypes. One problem with this
kind of approach is that results suggesting the learning of a prototype can
often be explained by a feature theory (see Pearce, 1994). For example,
in Chapter 1, we saw that a connectionist network (using the Rescorla-
Wagner equation) can learn what looks like the prototype of a “dog” or
“cat” even though the subject is merely associating different features. The
idea is that the most predictive features become associated in a kind of
net, analogous to a prototype, and with enough common features, one
gets responding as if the animal was generalizing from a prototype (see
also Mackintosh, 1995).
A third possible way that animals might learn to categorize is the sim-
plest way of all. According to exemplar theory, a bird might learn and
remember each picture or image presented to it in the training phase. When
novel stimuli are tested, the bird might then generalize from all these ex-
emplars and respond to the extent that the new picture is similar to a
picture that was reinforced before. The approach assumes that animals
can remember a very large number of pictures. That is not impossible; as
mentioned before, pigeons can remember a very large number of indi-
vidual slides (Cook et al., 2005; Vaughan & Green, 1984). Furthermore, the
exemplar approach has been very successful in understanding categoriza-
tion in humans (e.g., Kruschke, 1992; Medin & Schaffer, 1978; Nosofsky,
1987). The idea that animals learn about whole configurations of stimuli
and then respond to new stimuli, according to how well they generalize
to them, is also consistent with the Pearce (1987, 1994) configural theory
we encountered in Chapters 4 and 5.
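
A minimal Python sketch may make the exemplar idea concrete. It is only loosely patterned after exemplar models such as Medin and Schaffer's; the feature coding, the exponential similarity rule, and the specific numbers are assumptions chosen for illustration.

```python
import math

def similarity(x, y, c=1.0):
    """Similarity falls off exponentially with the distance between stimuli."""
    distance = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-c * distance)

def predicted_responding(test_item, memory):
    """memory: list of (stored exemplar, was it reinforced?) pairs.
    Responding is the similarity-weighted proportion of reinforced exemplars."""
    total = sum(similarity(test_item, ex) for ex, _ in memory)
    plus = sum(similarity(test_item, ex) for ex, was_plus in memory if was_plus)
    return plus / total

# Three remembered training pictures, coded as simple feature vectors.
memory = [((1, 1, 0), True), ((1, 0, 1), True), ((0, 0, 0), False)]
print(predicted_responding((1, 1, 1), memory))  # strong responding to a novel item
```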
Research on categorization in pigeons continues, but at this point it
should be clear that this fascinating example of animal learning is not mi-
raculous. It almost certainly involves learning processes that are similar to
the ones studied in experiments on classical conditioning. This is not really
accidental. We have always studied classical conditioning in part because
we think that its principles will apply to other, more complex-looking ex-
amples of learning—like categorization.

Basic Processes of Generalization and Discrimination


Research on categorization is part of a long tradition in learning theory bent
on figuring out how we generalize and discriminate between stimuli. That
tradition has uncovered a number of basic facts and tools that will help us
understand more complex forms of learning and behavior.
The generalization gradient


First, as implied above (and also mentioned in previous chapters), behavior
that has been reinforced in the presence of one stimulus generalizes quite
lawfully to similar stimuli. In a classic pigeon experiment, Guttman and
Kalish (1956) reinforced pecking at a simple key that was illuminated with
a pure color (e.g., a yellowish-orange color with a wavelength of 580 nm).
After training for several days, the birds were tested with trials in which
the color of the key light was varied in many steps between green (520 nm),
yellow, orange, and finally into red (640 nm). The tests were conducted in
extinction, which means that none of these new test colors was ever rein-
forced, and there was thus no payoff to respond to any of them. The birds
nonetheless pecked at the new key colors if their wavelength was similar
enough to the original 580-nm value. The results are shown in Figure 8.8.
This is a well-known example of a stimulus generalization gradient—re-
sponding to a new stimulus depends on its similarity to a stimulus that has
already been reinforced. Generalization gradients have been shown for a
wide range of stimuli. For example, more recent experiments have shown
similar gradients when pigeons are first trained to peck in the presence
of computer images of three-dimensional objects and the objects are then
rotated in virtual space (e.g., Cook & Katz, 1999; Peissig, Young, Wasser-
man, & Biederman, 2000; Spetch & Friedman, 2003).
Notice that if the birds had not learned about the color of the key, they
might have responded to any key light, regardless of its color. Thus, the
steepness of the generalization gradient—the rate at which responding
declines as the stimulus is changed—indicates how much the responding

Figure 8.8  A stimulus generalization gradient. Responding by pigeons to a key
illuminated by different colors (wavelengths in nanometers) after being reinforced
for pecking a key of 580 nm. (After Guttman & Kalish, 1956.)

actually depends on a particular stimulus dimension. One of the major


points of research on stimulus control is that the shape of the gradient is
not automatically controlled by cues that are present when an operant
is reinforced. Instead, the shape of the gradient can be affected by learning.
Often, another stimulus has to be nonreinforced; that is, the SD must be
discriminated from an SΔ (see Chapter 7). For example, Jenkins and
Harrison (1960, 1962) reinforced pigeons for pecking a white key that was
illuminated on each trial. The experiments manipulated the
presence of a 1000-Hz tone that could be sounded in the background. In
one condition, pecking the key was reinforced if the tone was on but was
not reinforced if the tone was off. As you can imagine, the birds eventually
responded mainly when the tone was on. In another condition, the birds
learned to discriminate between two tones: Pecking was reinforced when
the 1000-Hz tone was on, but pecking was not reinforced when a slightly
lower tone (950 Hz) was on. Here again, you can imagine that the birds
eventually confined their responding to when the 1000-Hz tone was on.
In a final condition, the birds received no discrimination training. The
1000-Hz tone was always on, and the birds were reinforced whenever
they pecked the illuminated key.
The crucial results are from a final generalization test in which the birds
were tested for pecking in the presence of tones of other frequencies; test
results are summarized in Figure 8.9. The steepness of the gradient de-
pended on the previous training. When the birds had not received explicit
discrimination training, the generalization gradient was flat, indicating
equal responding to all the tested tones; that is, there was no evidence of
stimulus control by the tone in this situation (see Rudolph & van Houten,
1977, for an extension). In contrast, when the birds had been reinforced
while the 1000-Hz tone was on but not reinforced when the tone was off,
there was a lovely stimulus generalization gradient. The sharpest gradient

Figure 8.9  The sharpness of the generalization gradient depends on the type of
training. When pigeons had to discriminate between a 1000-Hz tone S+ and a
950-Hz tone S–, the gradient was much sharper than when the pigeons
discriminated between the 1000-Hz tone (S+) and silence (S–). The control group
received no discrimination training with the tone before the test, and they showed
little stimulus control by the tone. (After Jenkins & Harrison, 1960, 1962.)

of all was produced by the group that had learned to discriminate 1000 Hz
from 950 Hz. These birds showed a maximum of responding near 1000 Hz
and virtually nothing to other stimuli. Thus, stimulus control is powerfully
influenced by discrimination training.
Why should that be true? You can probably imagine that the birds had
learned to pay attention to the tone when they had to discriminate it from
silence or a tone of another frequency. We discussed the role of attention in
learning about signals for reinforcement in Chapter 4 when we considered
models of classical conditioning (e.g., Mackintosh, 1975a; Pearce & Hall,
1980). Notice that the context was always present when the birds were rein-
forced in the tone’s presence but not in its absence. We can therefore think
of this situation as one in which the tone—plus context—was reinforced,
but the context alone was not (TC+, C– training, where T is the tone and
C is the context). Mackintosh’s theory predicts that attention will be paid
to the relevant predictor. Alternatively, the birds might learn that the tone
is the main predictor of the reinforcer, as is implied by models like the
Rescorla-Wagner model.
Another process that can affect the steepness of the generalization gradi-
ent is inhibition, which might have developed when Jenkins and Harrison
nonreinforced the 950-Hz S–. Honig, Boneau, Burstein, and Pennypacker
(1963) ran an experiment that demonstrated the importance of inhibition in
this sort of situation. The experiment included two groups of pigeons that
were taught to discriminate a plain white key from a white key with a verti-
cal black bar presented on it (Figure 8.10). For one of the groups, pecking
was reinforced when the vertical black bar was present on the white key,
but pecking was not reinforced when the bar was absent. The other group
received the opposite treatment: Pecking was reinforced when the key was
merely white and was not reinforced when the vertical black bar was present.

Figure 8.10  Excitatory and inhibitory generalization gradients with line-tilt stimuli
(responding plotted against degrees of tilt). For one group of pigeons, a vertical
black bar on a white key was S+, and the white key without the bar was S–. This
group showed the familiar gradient when lines of different tilts were tested. For
another group of pigeons, the white key without the vertical black bar was S+, and
the white key with the bar was S–. For this group, the vertical black bar suppressed
responding; this inhibitory effect decreased as the tilt was changed. (After Honig
et al., 1963.)

Not surprisingly, both groups learned to peck when appropriate. What was
interesting again, however, was the responding that emerged during a final
generalization test. Both groups were now presented with the usual verti-
cal black bar on the key, but on different trials it was presented at different
angles—deviating from vertical to horizontal. The results are summarized
in Figure 8.10. Notice that the group that had been trained to peck in the
presence of the vertical black bar showed another gradient like the ones that
we have seen in previous figures. There was a clear peak with vertical, and
then responding declined lawfully as the angle of the bar changed.
The new result was the one with the group for which the vertical black
bar had signaled no reinforcement. In this case, the birds responded little
to the vertical black bar, but pecking increased systematically as the angle
of the bar differed. In this group, the vertical black bar was actively sup-
pressing the pecking response. This result implies that inhibition develops
to a stimulus that signals nonreinforcement of an operant response. Again,
this is not necessarily surprising based on what you already know. No-
tice that this group received reinforcement with the white key (W+) and
nonreinforcement when the vertical black bar was added to the white key
(BW–). This procedure is another case of feature-negative discrimination
learning (see Chapters 5 and 6).
One point worth making is that this line of research came out of the Skin-
nerian tradition, where inferring unobserved processes is somewhat frowned
upon. In the end, investigators came to see the value of inferring processes
like inhibition and attention—processes that were only indirectly observable.
Interactions between gradients
A role for inhibition in discrimination learning had been anticipated in ear-
lier papers by Kenneth Spence (e.g., Spence, 1936). He suggested that an S+
will receive some excitatory conditioning when it is reinforced and that the
excitation will generalize from it to other similar stimuli. In a similar way,
S– will receive some inhibitory conditioning when it is nonreinforced, and
this inhibitory conditioning will also be ready to generalize. The gradients
shown in Figure 8.10 seem nicely consistent with Spence’s theory—as well
as with the many conditioning theories that we studied in Chapters 4 and
5, which trace their origins to Spence’s approach.
Spence’s theory generates some other interesting predictions that are
worth considering, though. Figure 8.11 shows the theoretical result of train-
ing a discrimination between two very similar stimuli. Reinforcement of a
550-nm key light will condition excitation to 550 nm, and this excitation
generalizes to similar stimuli. Nonreinforcement of a similar key light color,
570 nm, conditions inhibition to it, which likewise generalizes. This is not sur-
prising, right? But another implication is interesting. Suppose that we were
to test responding to all the stimuli surrounding S+ and S–. Spence assumed
that responding to the new stimuli would depend on how much inhibition
and excitation generalize to the new stimuli from S+ and S–. Inhibition (with
its negative value) would subtract from excitation. Figure 8.11 shows the net
Figure 8.11  Hypothetical excitatory and inhibitory gradients after discrimination
training with a 550-nm key light as S+ and a 570-nm key light as S–. (The upper
panel shows generalization of excitation around S+ = 550 nm; the lower panel
shows generalization of inhibition around S– = 570 nm; both are plotted against
wavelength in nm.) During generalization tests with key lights of other colors, the
predicted level of responding (blue circles) is simply the difference between the
amount of excitation and inhibition that generalizes there. Notice that
generalization of inhibition causes the peak of responding to move away from S–;
in this example, it would be at 540 nm rather than 550 nm, a peak shift. (After
Rilling, 1977.)


generalization that would lead to predicted responding. Notice something


rather subtle and interesting. Because of the inhibition generalizing from S–,
the theory does not predict the most responding at the trained S+. Instead, it
predicts even more responding with stimuli that are farther away from S–!
That is, the peak of the generalization gradient should shift away from S–,
to a stimulus (540 nm) that was never actually reinforced.
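
The arithmetic behind this prediction is easy to reproduce. The following Python sketch assumes Gaussian-shaped gradients with arbitrary widths and strengths (none of these values comes from Spence or from any experiment); it simply subtracts the generalized inhibition around S– (570 nm) from the generalized excitation around S+ (550 nm) and finds where the difference peaks.

```python
import math

def gaussian(x, center, width):
    """A simple bell-shaped generalization gradient around a trained value."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def predicted(wavelength, s_plus=550, s_minus=570, width=20, inhib_strength=0.6):
    excitation = gaussian(wavelength, s_plus, width)
    inhibition = inhib_strength * gaussian(wavelength, s_minus, width)
    return excitation - inhibition

wavelengths = range(480, 621, 10)
best = max(wavelengths, key=predicted)
print("Predicted peak of responding:", best, "nm")   # below 550 nm: a peak shift
```

With these illustrative values, the predicted peak among the tested wavelengths falls at 540 nm rather than at the trained 550-nm S+, which is the shift described in the text.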
This prediction was nicely confirmed in several experiments that were
conducted many years after Spence’s theory was published. The most fa-
mous is one reported by Hanson (1959), who was working in Guttman and
Kalish’s lab. One of Hanson’s groups of pigeons received a treatment very
similar to the one just described. These pigeons were reinforced for pecking
a 550-nm key light and were not reinforced for pecking an extremely similar
555-nm key light. A second group also received discrimination training
except that the nonreinforced S– was a distant 590 nm. A third group—a
control—simply received reinforcement with 550 nm. Generalization was
then tested by giving the birds the chance to respond to key lights of dif-
ferent wavelengths. The results are shown in Figure 8.12. These results are
justly famous—it is clear that the groups given the discrimination training
showed more responding to 540 nm than to 550 nm. This was especially
true of pigeons given the 555-nm S–, which should have generalized most
Figure 8.12  Peak shift. Three groups of pigeons were reinforced for responding
in the presence of a 550-nm key light. One group was nonreinforced when
presented with 555 nm, and another group was nonreinforced when presented
with 590 nm. A control group received reinforcement with 550 nm, without any
nonreinforced trials. The peak of responding for the two experimental groups
shifted away from S–. (After Hanson, 1959.)


strongly to 550 nm. This phenomenon, in which the peak of the generaliza-
tion gradient moves away from S–, is called peak shift. It has been shown
in both operant and classical conditioning (Weiss & Weissman, 1992) and
with a variety of stimuli, including intensity of the stimulus (e.g., Ernst,
Engberg, & Thomas, 1971), its spatial position on the computer screen
(Cheng, Spetch, & Johnston, 1997), and even the tilt of the Skinner box floor
(Riccio, Urda, & Thomas, 1966). It has even been shown in a categorization
task (Mackintosh, 1995).
Peak shift is an important phenomenon for several reasons. It suggests
that the highest level of responding can somewhat paradoxically occur to
stimuli that have never been directly reinforced. It suggests the possible role
of inhibition. And, like other aspects of the psychology of stimulus control,
it might have implications for evolution. For instance, in a discussion of
beak-color preferences in female zebra finches, Weary, Guilford, and Weis-
man (1992) suggested that peak shift might play a role in explaining why
elaborate characteristics and extreme coloration might evolve in members
of one sex. Female zebra finches tend to have orange beaks, but they prefer
to mate with males that have bright red beaks. This preference is at least
partly learned; it depends on the female’s early experience with males with
red beaks (e.g., Weisman, Schackleton, Ratcliffe, Weary, & Boag, 1994). Dur-
ing that experience, if a male’s red beak is associated with reinforcement
(perhaps sexual in nature) but an orange female’s is not, females might
prefer to mate with males possessing extremely red beaks that differ even
more from the female’s orange S–. If this speculation is true, males with
more extreme-colored beaks would have more offspring in the future. Peak
shift in the female’s mating preference would yield something like “beak
shift” in the evolution of the male’s beak coloration.
Spence’s ideas about the interactions of gradients around S+ and S–
have at least one other implication. Suppose that we train an animal so that
a relatively bright stimulus is reinforced (S+) and a relatively dark stimulus
Figure 8.13  Transposition. Hypothetical excitatory (red line) and inhibitory (blue
line) generalization gradients of response strength after discrimination training
with a relatively bright S+ and a darker S–. If given a choice between a bright (S+)
and an even brighter stimulus (S'), the animal might choose the brighter S'. That
result might occur because of the larger difference between the excitation and
inhibition generalizing there and not because the animal has learned a relationship
between the stimuli ("always choose the brighter stimulus").


is not (S–). Doing so could lead to gradients of excitation and inhibition


around S+ and S–, as illustrated in Figure 8.13. As before, the strength of
responding to a given stimulus will depend on the difference in the amount
of excitation and inhibition that generalize to it. If given a choice between
S+ and S–, the animal will therefore prefer S+. But now suppose that the
animal is given a choice between S+ and an even brighter stimulus, S’. The
gradients in Figure 8.13 suggest that, in this case, the animal should prefer
S’ over S+; that is, it should choose the even brighter stimulus. Such a result
has been shown in a number of experiments (e.g., Honig, 1962; Lazareva,
Wasserman, & Young, 2005) and is called transposition. Transposition
seems to suggest that the animal responds to the relationship between two
stimuli (i.e., their relative brightness) rather than their absolute properties.
Yet, the power of Spence’s analysis is that it explains transposition without
supposing the animal learns about stimulus relationships. One problem
for Spence is that transposition can still occur under conditions that can
only be explained by relational learning. For example, Gonzalez, Gentry,
and Bitterman (1954) trained chimps to respond to the middle stimulus of
three stimuli that differed in size. When they tested the chimps with three
altogether new stimuli (also varying in size), the chimps again chose the
middle stimulus, even though Spence would predict a choice of the extreme
stimulus closest to the original S+. One conclusion is that animals do some-
times learn about the relationships between stimuli (see also Lazareva et al.,
2005). A similar interpretation has been provided for the peak shift effect in
humans (e.g., Thomas, Mood, Morrison, & Wiertelak, 1991), although the
role of relational learning here may be more consistent with the behavior
of humans than that of other animals (e.g., Thomas & Barker, 1964).
To summarize, organisms generalize lawfully from one stimulus to other
stimuli that are similar. This process interacts with discrimination learning,
though. The peak shift phenomenon suggests that generalization gradients


can influence the results of explicit discrimination training. At the same time,
discrimination learning can influence the extent to which animals generalize
from stimulus to stimulus. One point is that discrimination and generaliza-
tion processes are always interacting. Another is that theories of associative
learning that emphasize inhibition, excitation, and generalization go a sur-
prising distance in explaining stimulus control phenomena.
Perceptual learning
Discriminations can often be learned between impressively similar stim-
uli. For example, wine connoisseurs eventually make fine discriminations
between different wines and vineyards, bird watchers make subtle dis-
criminations between different birds (Figure 8.14), and musicians become
exquisitely expert at discriminating different musical intervals and chords.
It is possible, of course, that such fine discriminations can result from dif-

Figure 8.14  Experts can make


astonishingly subtle discrimina-
tions between similar stimuli.
You may recognize these birds
as wrens, but only a seasoned
bird-watcher can discrimi-
nate the Bewick’s Wren, the
Carolina Wren, and the Marsh
Wren among them. (Plate from
A Field Guide to the Birds of
Eastern and Central North
America, 5th Edition by Roger
Tory Peterson. Illustrations
© 2002 by Marital Trust B u/a
Roger Tory Peterson. Reprint-
ed by permission of Houghton
Mifflin Company.)
ferential reinforcement of the different stimuli (following the principles


we have been discussing). But there is also evidence that discriminations
can be further facilitated by mere exposure to the similar stimuli—without
any differential reinforcement.
In a classic experiment, Gibson and Walk (1956) hung both a large metal
rectangle and a triangle on the wall of the home cage of a group of rats for
several weeks. Compared to rats that had no exposure to the rectangle and
triangle during this period, the rats exposed to these cues were quicker to
discriminate between them when one shape was eventually used as an
S+ and the other shape as an S– in a discrimination experiment. Simple
exposure somehow made the two different shapes easier to discriminate,
a phenomenon known as perceptual learning. Because neither stimulus
was reinforced in the initial phase, the rules for discrimination learning we
have been considering do not easily apply; there would need to be some
differential reinforcement of one stimulus over the other to achieve dif-
ferences in excitation or inhibition. One idea was that exposure somehow
led the animals to differentiate the stimuli (Gibson & Gibson, 1955), but
the mechanisms surrounding the idea were vague (but see Saksida, 1999).
Theories of conditioning seem even further challenged by perceptual
learning because they usually predict that nonreinforced exposure to a
stimulus should reduce its conditionability—that is, it should cause latent
inhibition. Notice, however, that Gibson and Walk (1956) were not inter-
ested in rate of conditioning, per se. Rather, they were looking at the rate at
which two stimuli were discriminated, not the rate of conditioning to the
stimuli individually. In principle, the rate of learning could be quite slow,
but the difference in responding to S+ and S– might
develop relatively quickly.
Figure 8.15  Two similar stimuli (A and B, represented by circles) are theoretically
composed of unique elements (regions a and b) and also common, overlapping
elements (region x). This perspective has interesting implications for perceptual
learning, in which simple exposure to stimuli A and B may facilitate the learning
of a discrimination between the two.

All of this was recognized in an interesting approach suggested by Ian McLaren
and Nicholas Mackintosh (McLaren, Kaye, & Mackintosh, 1989; McLaren &
Mackintosh, 2000; see also Hall, 1991, 1996). McLaren and Mackintosh begin
with the familiar assumption that any stimulus is made up of many features or
elements. Organisms will generalize between two stimuli, and thus have trouble
discriminating between them, to the extent that they share common features. For
example, the circles in Figure 8.15 represent the sets of elements that are
contained in each of two stimuli, A and B. Each stimulus has some unique
elements (a and b) and some common, overlapping, elements (x). If one presents
Stimulus A, one actually presents a combination of unique and shared elements
(a and x together); if one presents Stimulus B, one similarly presents b and x. If
Stimulus A is paired with a reinforcer, a and x would both acquire associative
strength. The organ-
ism would respond to Stimulus B (bx) and therefore show generalization


because the common element (x) had received some conditioning.
When combined with what we already know about learning, this way
of conceptualizing Stimuli A and B actually leads to at least three mecha-
nisms that can contribute to perceptual learning. First, notice that preex-
posure to Stimuli A and B (as in the Gibson and Walk experiment) would
really involve exposure to the two compounds, ax and bx. Because neither
Stimulus A nor B is reinforced in the preexposure phase, the consequence
is that ax and bx would become latently inhibited—they would therefore
be more difficult to condition in the next phase. But notice what actually
happens to the separate elements. If there are four preexposures to Stimulus
A and four preexposures to Stimulus B, there would be four preexposures
to element a, four preexposures to element b, and eight preexposures to
element x. That is, the common elements receive more latent inhibition than the
unique elements. This means that the common elements should be espe-
cially difficult to condition. Also, because generalization depends on the
conditioning of common elements, there will be less generalization (and
easier discrimination) between Stimulus A and Stimulus B. We have just
uncovered one explanation of perceptual learning.
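
A small Python sketch may help make the counting explicit. It is only an illustration of the bookkeeping, not of McLaren and Mackintosh's full model: it simply tallies how often each element is preexposed when the two compounds ax and bx are each presented four times, as in the example above.

```python
from collections import Counter

def element_exposures(trials):
    """trials: list of compounds, each written as a string of element names."""
    exposures = Counter()
    for compound in trials:
        exposures.update(compound)   # one exposure for every element present
    return exposures

# Four nonreinforced presentations of each compound.
exposures = element_exposures(["ax"] * 4 + ["bx"] * 4)
print(exposures["a"], exposures["b"], exposures["x"])   # 4, 4, 8

# If latent inhibition grows with the number of nonreinforced exposures, the
# shared element x (8 exposures) is the hardest to condition later, so less
# excitation accrues to x during ax+ training and less generalizes to bx.
```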
Some nifty experiments by Mackintosh, Kaye, and Bennett (1991) sup-
port this approach. These experimenters used a taste aversion procedure
in which rats drank saline and sucrose solutions that could be mixed with
a third flavor, lemon. We will call saline, sucrose, and lemon, stimuli a, b,
and x (respectively), just to keep them consistent with Figure 8.15. The idea
behind the experiments is sketched in Table 8.1. Mackintosh et al. (1991)
knew that if rats simply received a pairing of saline (a) with illness and were
then tested with sucrose (b), there would be no generalization of the taste
aversion from a to b. But things were different in the experiment, because
the rats received conditioning with the saline-lemon compound (ax) and
then testing with the sucrose-lemon compound (bx). As demonstrated by
Group 1, there was a lot of generalization of the aversion from ax to bx,
which is not surprising given the common lemon flavor (x). But if the rats
had first received preexposures to ax and bx (Group 2), there was much less
generalization between them. This result illustrates perceptual learning;

TABLE 8.1  Design and results of a perceptual learning experiment

Group   Preexposure      Conditioning   Test    Result: generalization between ax and bx?
1       —                ax+            bx?     Yes
2       6 ax–, 6 bx–     ax+            bx?     No
3       12 x–            ax+            bx?     No

Source: From Mackintosh, Kaye, & Bennett, 1991.
preexposures to the compound stimuli made them easier to discriminate.


The interesting group was Group 3. This group received the same preex-
posures to x—but alone, without a and b. After conditioning with the ax
compound, this group also showed little generalization from ax to bx. This
result tells us that preexposure to the common element (x) is alone sufficient
to make it easier to discriminate the compounds. Perceptual learning can
thus result, at least in part, from latent inhibition to common elements.
But there is apparently even more than this going on, as other experi-
ments reported by Mackintosh et al. (1991) suggested. McLaren et al. (1989)
noted that during exposures to complex cues, animals will learn to associate
the various elements (e.g., Rescorla & Durlach, 1981). During ax, they might
learn to associate a with x; and during bx, they might learn to associate b with
x. (This learning is simply sensory preconditioning, a phenomenon that we
discussed in Chapter 3.) But think about an interesting consequence of this.
Because x is associated with b, x will retrieve b when the rat is being exposed
to ax. Casually speaking, the rat will “expect” b during ax, but it is not there.
Based on what we know about inhibition in classical conditioning, the rat
will adjust its expectations accordingly: It will learn to expect “no b” in the
presence of a. For exactly the same reason, the rat will also learn to expect
“no a” in the presence of b. McLaren et al. thus noted that the familiar rules
of associative learning predict that mixed exposures to ax and bx will create
inhibitory associations between a and b. This is a second reason that preex-
posure to ax and bx might reduce generalization between them. Consistent
with the idea, Dwyer, Bennett, and Mackintosh (2001) demonstrated inhibi-
tory associations between saline and sucrose when rats receive intermixed
exposures to saline-lemon and sucrose-lemon drinks. Animals can also form
inhibitory associations between flavors in other, related, flavor exposure pro-
cedures (e.g., Espinet, Artigas, & Balleine, 2008). Inhibition between unique
elements is thus a second mechanism behind perceptual learning.
There is a third mechanism as well. If the stimuli are especially complex
(i.e., composed of a large number of different features), an organism might
be able to sample only a small portion of the elements on any given trial.
If we associate those elements on each trial and then sample and associate
some new elements on each successive trial, a whole network of intercon-
nected elements can be built up over trials. In effect, associations between
the different elements will create a “unitized” representation of the stimulus
(McLaren & Mackintosh, 2000) in the same way that different features might
call up a unitized category. Such a representation would allow a subset of
the features sampled on any given trial to activate the entire representa-
tion (this process is sometimes called pattern completion and is a property of
many connectionist models, e.g., McClelland & Rumelhart, 1985). There is
evidence that unitization does operate. If a rat is put into a box and imme-
diately given an electric shock, it will not learn to associate the box with the
shock (e.g., Fanselow, 1990). But if the shock is delayed for a minute or two
(so the rat has more time to learn about the box) or if the rat is exposed to
the box for a minute or two the day before being placed in the box and im-
mediately given a shock, the rat will readily learn about the box (Fanselow,
1990; Kiernan & Westbrook, 1993). Without the extra exposure, there is no
unitized representation of the box. Rudy and O’Reilly (1999) have further
shown that different features must be experienced at the same time (rather
than presented separately) if exposure to the different features is to facilitate
conditioning. Animals do seem to form representations of complex stimuli
when they are simply exposed to them. Unitization is thus a third possible
mechanism that can lead to perceptual learning.
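
To make the unitization idea a bit more concrete, here is a minimal Python sketch. Everything about it is an illustrative assumption (the number of elements, how many are sampled per exposure, the retrieval threshold); the point is only that pairwise associations among co-sampled elements accumulate across exposures until a small subset of elements can activate most of the rest, which is the pattern-completion property described above.

```python
import random
from collections import defaultdict

elements = list("ABCDEFGH")        # features of one complex stimulus
assoc = defaultdict(float)         # strength of each element-to-element link

def exposure(sample_size=3, increment=1.0):
    """One exposure: sample a few elements and associate every sampled pair."""
    sampled = random.sample(elements, sample_size)
    for i in sampled:
        for j in sampled:
            if i != j:
                assoc[(i, j)] += increment

for _ in range(40):                # repeated exposures to the same stimulus
    exposure()

def retrieved(cue, threshold=2.0):
    """Elements whose summed association from the cue exceeds the threshold."""
    return {j for j in elements if j not in cue
            and sum(assoc[(i, j)] for i in cue) >= threshold}

print(retrieved({"A", "B"}))       # typically most of C through H
```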
There are several points to remember from all this. First, two stimuli can
become more discriminable if we are merely exposed to them without dif-
ferential reinforcement. Second, this discrimination might occur if stimuli
are composed of many different elements that can be latently inhibited or
interassociated. Third, by taking this sort of position, we find that the laws
of learning that we have discussed in previous chapters are once again
actually happening everywhere. The basic learning principles we have
already studied seem to apply here, although there is probably room for
additional processes (e.g., see Mitchell & Hall, 2014).
Mediated generalization and acquired equivalence
Other learning processes can actually lead us to generalize more between
two different stimuli as a function of experience. The idea is that if two
stimuli are each associated with a common third stimulus, we will begin
to generalize between the stimuli. The idea is simple but powerful: If you
associate both Dave and Bill with motorcycles, you may start treating Dave
and Bill as being more alike.
In a simple demonstration of this phenomenon (Table 8.2), Honey and
Hall (1989) paired two auditory cues (A and N) with food. The common
association with food was expected to increase the generalization between
A and N. To test this hypothesis, N was subsequently paired with an electric
shock on several trials so that N aroused fear. Honey and Hall then tested
fear to A and also to a new auditory cue, B. Fear generalized from N to A
in this group, but not to B, and there was no generalization in a control
group that did not have A and N paired with food. Honey and Hall (1991)
reported a similar effect when stimuli like A and N were paired with a
neutral stimulus (rather than paired with food). The point is that when two
stimuli are associated with something in common, we tend to treat them
as if they are more equal. This effect is known as acquired equivalence,

TABLE 8.2  Design of a mediated generalization experiment

Group   Phase 1                   Phase 2      Test
1       A — Food, N — Food        N — Shock    A? B?
2       A — Food, N — No food     N — Shock    A? B?

Source: From Honey & Hall, 1989.
Figure 8.16  Example of a sequence of stimuli used in a matching-to-sample
experiment (from top to bottom: "start" cue, sample stimulus, comparison stimuli).
The circles are a row of response keys at the front of a pigeon chamber. (Gray
circles indicate keys that are not illuminated.) The trial begins with a "start" cue,
which the pigeon typically pecks to receive the sample stimulus. The sample
stimulus stays on for a short period of time, and then two comparison keys are
turned on. Pecks to one of these keys are reinforced. In matching-to-sample, pecks
to the comparison stimulus that matches the sample are reinforced. In delayed
matching-to-sample (DMTS), a delay can be inserted between presentation of the
sample and the comparison stimuli.

or mediated generalization. Generalization between A and B is “medi-


ated” by their association with a common element; A and B are thought to
retrieve a common representation.
There are many illustrations of this kind of effect. For example, pigeons
can be run in experiments using variations on the matching-to-sample
procedure illustrated in Figure 8.16. On each trial, after an initial “start”
cue (that gets the subject ready for what is to follow), a “sample” stimulus
is presented on a central key. For example, the key might be illuminated
Red. Immediately after this stimulus, the bird is shown two “compari-
son” stimuli on the side keys: for example, a Red key and a Green key. In
a matching-to-sample experiment, the bird is reinforced for pecking the
comparison key that matches the sample—after a Red sample, the Red
comparison (not the Green one) is reinforced. On trials when the sample
stimulus is Green, the Green comparison (not the Red one) is reinforced.
Many variations on this procedure are possible (as we will see shortly).
They always involve a conditional discrimination because either compari-
son stimulus is reinforced on half of the trials, but the one that will be rein-
forced on any particular trial depends on (i.e., is conditional on) the sample
cue that came before it. Thomas Zentall (1998) reviewed a large number
of experiments, which all suggest that when different samples signal that
the same comparison stimulus is correct, the different samples are treated
as equivalent (Kaiser, Sherburne, Steirn, & Zentall, 1997; Urcuioli, Zentall,
Jackson-Smith, & Steirn, 1989; Zentall, Steirn, Sherburne, & Urcuioli, 1991).
For example, birds will generalize between Red and Vertical Line samples if
they both signal that a Dot comparison stimulus will be reinforced. Zentall
and Peter Urcuioli argued that this generalization implies that samples with
TABLE 8.3  Mediated generalization in a categorization experiment

Phase 1                Phase 2            Test                      Result
(Original training)    (New responses)    (Old categories)
People:   Key 1+       People:  Key 3+    —                         —
Cars:     Key 1+       —                  Cars:   Key 3? Key 4?     Cars:   Peck Key 3
Flowers:  Key 2+       Flowers: Key 4+    —                         —
Chairs:   Key 2+       —                  Chairs: Key 3? Key 4?     Chairs: Peck Key 4

Source: From Wasserman et al., 1992.

common comparisons become coded similarly in the bird’s memory. The


Red and Vertical Line samples will acquire a so-called “common code”,
which is much like saying that they have acquired equivalence. See Zentall,
Wasserman, and Urcuioli (2014) for a recent review of this work.
Acquired equivalence and mediated generalization may play an in-
teresting role in categorization, where they may allow the creation of
ever-larger categories. For instance, Wasserman, DeVolder, and Coppage
(1992) provided an intriguing extension of acquired equivalence to the
categorization work described earlier (see also Astley, Peissig, & Wasser-
man, 2001; Astley & Wasserman, 1999). The scheme of the experiment is
summarized in Table 8.3. In the first phase, birds learned to categorize
pictures of people, cars, flowers, and chairs. In this experiment, unlike the
experiments we discussed earlier in this chapter, the birds responded in
the presence of two types of pictures by making the same response (e.g.,
for some birds, pictures of people and cars required a peck to Key 1 for a
reinforcer, whereas pictures of flowers and chairs required a peck to Key
2). Once the birds were responding well, one category from each of the
pairs was associated with a new response in a second phase. For example,
people pictures now required a peck to Key 3 for a reinforcer, and flower
pictures now required a peck to Key 4. The birds learned this, too. But the
best part was the final test, when the birds were given a chance to peck
Key 3 or Key 4 in the presence of cars and chairs—pictures that had never
been associated with these two keys. Remarkably, the birds responded to
the new key that the associated category had been linked with. That is, the
birds behaved as if they had combined two different subcategories (people
+ cars vs. flowers + chairs) into superordinate categories. I hope that you
can see the similarity to the simpler Honey and Hall experiment (see Table
8.2). In either case, animals generalize between physically different stimuli
if these stimuli have been associated with a common stimulus or response.
Because people and cars (and flowers and chairs) do not look much
alike, the birds were not merely responding to the pictures’ physical simi-
larities. Instead, they were responding to the psychological similarities cre-
ated by their associations with a response to a common key. Lea (1984)
suggested that an ability to categorize stimuli on a basis that transcends
mere physical appearance is an important step toward demonstrating true


“conceptualization” in animals. Urcuioli (2001) called it “categorization
by association”; another term is “associative concept learning” (Zentall et
al., 2014). Superordinate categories like these are not unusual in everyday
experience. You know that shoes, belts, and jackets (or lamps, chairs, and
beds) are similar even though they are physically different. Each of these
subcategories is associated with a common response (“clothing” or “furni-
ture”) that helps put the subcategories together into a larger one.
What this research tells us is that generalization can occur between
cues that have little in the way of physical similarity. Common associations
provide a kind of glue that gives different cues a psychological similarity
or equivalence. Research thus indicates that learning processes can make
physically similar stimuli different—as in perceptual learning, or physi-
cally different stimuli similar—as in acquired equivalence and mediated
generalization.
Conclusion
Organisms generalize lawfully and respond in the presence of stimuli
that are similar to stimuli that have been associated with reinforcers or set
the occasion for operant responses. Generalization is at least partly due
to the physical properties of the stimuli; we generalize between stimuli
that share common features or elements. A major theme of this section,
however, is that generalization is also influenced by psychological pro-
cesses: It is powerfully influenced by learning. Discrimination training
can sharpen the shape of generalization gradients and create surprising
new peaks in responding. It does so by introducing inhibition. Mere
exposure to stimuli can also help make these stimuli more discriminable
by allowing various processes that create perceptual learning. And as-
sociating stimuli with common events can also increase generalization
between them. These basic facts of stimulus control provide tools that
can be used to understand complicated examples of stimulus control,
including (of course) categorization.

Another Look at the Information Processing System


If stimuli are going to guide instrumental action, modern thinking sug-
gests that they will also need to be processed in the familiar information
processing system—or standard model of cognition—that we have seen
before (see Figures 1.14 and 4.10). As you will recall, stimuli in the
environment are thought to be processed in a predictable sequence. Their
perceptual properties are first analyzed and put into a sensory memory
store. If these stimuli are further attended to, the information is then trans-
ferred to a hypothetical short-term (or working) memory store, where it
can be rehearsed, further processed, and eventually put into long-term (or
reference) memory. Not surprisingly, each of these processes plays a role
in allowing stimuli to guide instrumental action.
Visual perception in pigeons


Thanks in part to computer graphics technology, a great deal of interest-
ing research has investigated visual perception processes in the pigeon
(e.g., Cook, 2001a; Lazareva, Shimizu, & Wasserman, 2012). Much of this
research has been inspired by earlier research on perception in humans.
For example, pigeons and humans may recognize visual objects based on
their perception of primitive components called geons (for “geometric
ions”). According to “recognition by components” theory (Biederman,
1987), humans must first perceive these simple components before they can
recognize an object’s overall structure. In a line drawing, most geons occur
at line intersections. Using familiar techniques, a number of studies have
confirmed the importance of recognition by components in pigeons (e.g.,
Wasserman & Biederman, 2012). For example, Van Hamme, Wasserman,
and Biederman (1992) first trained pigeons to peck a different key in the
presence of each of the four drawings shown in the left column of Figure
8.17. After the birds were responding well, the experimenters presented
test trials with the stimuli shown in the middle and right-hand columns.
Stimuli in the middle column are made up of brand-new line segments, but
geons are still detectable at their intersections, and the human eye perceives

Figure 8.17  Stimuli used in the experiment by Van Hamme et al. (1992). The
columns show the training stimuli (line drawings of a penguin, a turtle, a rolling
pin, and a lamp), versions in which the geons are preserved, and versions in which
the geons are not preserved. Pigeons first learned to peck a unique key in the
presence of each of the four stimuli shown in the left column. When tested with
the new drawings shown in the middle column, the birds continued to respond
reasonably accurately, but when tested with the new drawings shown in the right
column, performance fell apart. Stimuli in the right column have the same line
segments as stimuli in the middle column, but the geons are not preserved. (From
Van Hamme et al., 1992.)

them as very similar to the stimuli at the left. The birds responded the same
way: Although they did not generalize perfectly to these new stimuli, their
responding was very far above chance. In contrast, there was virtually
no generalization to the images shown in the right-hand column, which
contains the same line segments as the middle column, but scrambled. No
geons are available in these stimuli. The results thus suggest that the birds
did not perceive the original stimuli as haphazard sets of squiggles and
lines. They recognized coherent patterns and generalized when the test
stimuli preserved the basic geons.

Figure 8.18  Examples of computer stimuli presented to pigeons by Cook (1992).
The three displays contain a unique region defined by shape, by color, or by a
conjunction of features. Unique regions in the upper two displays "pop out" from
the background. In contrast, the unique region in the bottom display requires more
attention and effort to find—it is made up of a combination (or conjunction) of
two dimensions that are varied in the background. (From Cook, 1992; color images
courtesy of Robert Cook.)
A line of research by Robert Cook (see Cook, 2001b) suggests that pi-
geons may also be like humans in how they perceive stimuli embedded
in a background of other stimuli. It is easy for you to see the Ts among the
circles and the squares in the top of Figure 8.18 or the green circles among
the other squares and circles in the middle of the same figure. The unique
regions “pop out” from the background; they are processed automatically,
without requiring attention and additional processing (Treisman, 1988).
In contrast, the unique region in the bottom display is not as easy to find.
It does not have a single unique feature, but is instead created by the
combination of two of the features varied in the background: a blue circle
among orange circles and blue squares. Here, the unique region does not
pop out, and it takes more focus and attention to find it. Cook (1992) pre-
sented these and similar arrays on a computer screen that was viewed by
pigeons. Responding was reinforced when the birds pecked at the unique
regions on the screen. (Peck location could be detected because Cook’s
images were presented on a touchscreen.) The birds were more accurate
at identifying regions that differed from the surrounding area by only one
feature (as in the top and middle displays). The birds were not as good at
finding regions made up of combinations of more than one dimension (the
“conjunctive” display at the bottom). Humans given the same stimuli were
also slower to point at the conjunctive displays, suggesting that the two
species might process the arrays similarly. Other experiments have identi-
fied further similarities—and differences—between pigeons and humans
in their detection of features in visual displays (e.g., Cook, Qadri, & Keller,
2015; Peissig & Goode, 2012; Spetch & Weisman, 2012).
Other experiments have examined the pigeon’s ability to discriminate
between displays that differ in their variability (for reviews, see Cook &
Wasserman, 2012; Wasserman, Young, & Cook, 2004). For example, Was-
serman, Hugart, and Kirkpatrick-Steger (1995) showed pigeons displays
of icons like those within each square in Figure 8.19. Pecks to a separate
key were reinforced when all the items were the same; conversely, pecks
to another key were reinforced when all the items were different. (Note
that this setup is merely yet another example of establishing stimulus
control by discrimination training.) After extensive training on 16 displays
of each type, the birds were doing well, pecking the correct key more
than 80% of the time. They were then tested with new “same” and “dif-
ferent” displays made up of novel items. Their performance was not bad
here either. Although generalization to the new stimuli was not perfect,
it was well above chance (see also Cook, Cavoto, & Cavoto, 1995; Young
& Wasserman, 1997). Other experiments have shown that same and dif-
ferent judgments are not affected much by small displacements of the
icons, so that the displays were not in perfectly straight rows and columns
(Wasserman, Frank, & Young, 2002). Most important is that whereas birds
identify “same” displays accurately regardless of the number of icons
contained in them (from 2 to 16), “different” displays are not identi-
fied as “different” until roughly 8 or more icons appear in the displays
Figure 8.19  "Same" (top) and "different" (bottom) displays used in the experiment
by Wasserman et al. (1995). (From Wasserman et al., 1995.)

(Young, Wasserman, & Garner, 1997). The best predictor of the birds’ same
and different judgments often turns out to be “entropy,” a mathematical
measure of the variability contained in the display. Roughly speaking,
the entropy value of a display increases as the number of different items
in it increases; a “same” display has an entropy of 0. Thus, birds peck
“same” when entropy is low and “different” when entropy is high. Pecks
at the “same” key gradually shift to pecks at the “different” key as a test
display’s entropy is gradually increased.
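
Entropy is easy to compute. The short Python sketch below uses the standard Shannon formula; treating each icon's identity as one observation is an assumption made for illustration, not a claim about exactly how Young and Wasserman scored their displays.

```python
import math
from collections import Counter

def entropy(display):
    """display: a list giving the identity of each icon in the array."""
    counts = Counter(display)
    n = len(display)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

same_display = ["duck"] * 16                          # 16 identical icons
different_display = [f"item{i}" for i in range(16)]   # 16 distinct icons

print(entropy(same_display))        # 0.0: no variability at all
print(entropy(different_display))   # 4.0: maximal variability for 16 icons
```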
There is some controversy about how entropy should be interpreted. In
one view, entropy mainly captures the visual texture in the display. Same
displays (with low entropy) have a smoother visual texture than different
displays (with high entropy), which look rough. The birds thus respond
to the smoothness of the display. Another possibility, though, is that the
birds compare the individual icons and respond according to whether they
are truly same or different. In this view, the decision to respond is made
at a more advanced level of information processing than perception (e.g.,
Wasserman et al., 2002). The idea that the pigeons are making a same/dif-
ferent judgment would be more convincing if they performed better with
small displays—when the displays contain only 2 items instead of 16 or
so. Unfortunately, as already mentioned, pigeons do not correctly identify
different displays as “different” until the arrays contain roughly 8 items
(see Young et al., 1997; but see Castro, Kennedy, & Wasserman, 2010; Katz
& Wright, 2006). (The value of entropy is low when different displays
have a small number of items.) Some have argued that true same/differ-
ent judgments may require language (Premack, 1983). It is worth noting,
though, that not all discriminations between same and different displays
in pigeons are a simple function of entropy (e.g., Cook, Katz, & Cavoto,
1997; see also Castro, Wasserman, & Young, 2012; Katz & Wright, 2006).
For the time being, it may be safest to conclude that animals are somehow
good at discriminating variability—an important dimension of stimulus
displays in the natural world (see Wasserman et al., 2004).
Attention
Many of the basic perceptual processes sketched above are thought to
operate automatically, without requiring either attention or mental effort.
We nonetheless already know that attention can be important in Pavlovian
learning (see Chapter 4), and it is similarly important in stimulus control.
For example, in the Castro and Wasserman (2014) experiment described
earlier (see Figure 8.6), pigeons pecked at features of a display when they
were relevant for predicting reward, as if they were paying attention to
them, and pecked less to features that were irrelevant. Donald Blough
(1969) had provided earlier evidence suggesting that pigeons pay attention
to the color of a pecking key or to the pitch of a tone primarily when the
visual or auditory dimensions were made relevant. As we saw in Chap-
ters 3 and 4, whether or not we pay attention to a dimension depends on
whether the dimension is informative.
Attention processes also come into play when we search for items in our
environment. I recently found a lost checkbook that was hidden among the
many papers cluttering my desk, but finding it took a fair amount of focus
and concentration (not to mention patience). Consider animals analogously
searching on the ground for food items. When prey items are abundant but
cryptic (i.e., hard to detect because they blend into the background), there
is a payoff for searching for (attending to) features that might help detect
them. The animal is said to form a search image (Tinbergen, 1960). To in-
vestigate the process, experimenters presented birds with a series of displays
that contained cryptic moths or pieces of grain, either of which needed to
be pecked for reinforcement (e.g., Bond, 1983; Langley, 1996; Pietrewicz &
Kamil, 1981; Reid & Shettleworth, 1992). (We saw an example of this type
of method in the blue jay experiment described at the start of this chapter.)
When cryptic items are presented in “runs” (i.e., repeated many times over
a series of trials), birds detect them more quickly and/or more accurately
than when the cryptic items are intermixed with trials having other items.
Detection of the item on a trial may boost attention to its features, aiding its
detection on the next trials (see Plaisted, 1997, for another interpretation).
The idea that prey detection is affected by attention is also consistent with
experiments on “divided attention.” Your attention is limited, as you no
doubt know from when two or more people have tried to talk to you at
the same time. (Typically, you must switch your attention between the two
speakers.) Dukas and Kamil (2001) had blue jays finding cryptic prey items
on a computer screen. In some trials, a picture of the cryptic item was pre-
sented just before presentation of a display in which it was hidden. In other
Figure 8.20  Attentional priming. Pigeons performed in a visual search task in
which they were reinforced for finding and pecking at target letters (e.g., A or L)
presented among other letters and numbers on a video screen. (A) Pigeons took
less time to find the target letter when the target was repeatedly presented on
"runs" of trials. (B) Pigeons also found the target letter faster when it was cued by
another stimulus ahead of time. (After Blough, 1989; Blough & Blough, 1997.)

trials, two different prey items were presented before a screen in which one
of the items was hidden. Detection rate was slower on the two-item trials,
perhaps because the blue jay was dividing its attention between two images.
Patricia Blough (e.g., 1989, 1991) noted that search image effects are
similar to attentional priming effects that are easy to observe in visual
search tasks in humans (e.g., Posner & Snyder, 1975). For example, in a
pigeon version of such a task, Blough (1989) had pigeons find and peck
at small-sized letters (e.g., an A or an L) that could be presented on a
computer screen among many distractors (other letters and numbers). As
shown in Figure 8.20A, the birds took less time to find and peck a letter
if it occurred in runs of several trials, as above. The birds also found and
pecked a particular letter more quickly if it was “cued” ahead of time by
a signal that was associated with it, such as bars presented on the sides of
the screen (Figure 8.20B; see also Blough, 2002). Both the previous run
and the signal are thought to focus attention on features of the primed
item. Interestingly, priming effects seem to disappear if pigeons are given
extensive practice with the task (Vreven & Blough, 1998), as if the birds shift
from a controlled or effortful style of processing to a more rapid, automatic
one (see Chapters 4 and 10). Other research also suggests that repetition of
distractor items in the background increases accuracy—that is, in addition
to learning to attend to the target, the birds can learn to ignore distractors
in the background (Katz & Cook, 2000).
Working memory
In a changing environment, stimuli are often present for only a short pe-
riod of time. However, the fact that information can be held for a while in
working memory allows even brief stimuli or events to guide instrumental
behavior after the stimuli are gone. Working memory has been extensively
studied using the methods of stimulus control. For example, the matching-
to-sample procedure described earlier (see Figure 8.16) is easily adapted
Figure 8.21  Pigeon working memory in the delayed matching-to-sample (DMTS)
task. Different curves illustrate retention over different delay intervals
when the bird had to peck the sample stimulus 1, 5, or 15 times. (After
Roberts, 1972.)

for this purpose. Recall that the pigeon is shown a “sample” stimulus (e.g.,
a Red key) and required to peck the same color key when given a choice
between a Red and a Green key (the “comparison” stimuli) to receive a re-
ward. In delayed matching-to-sample (DMTS), the pigeon is first trained
to respond this way, and then a delay is inserted between presentation
of the sample and the comparison stimuli on test trials. The pigeon must
now remember the sample (or perhaps remember what it should do next)
over the delay so as to peck the correct comparison stimulus. As shown in
Figure 8.21, performance declines as the interval between sample offset
and comparison onset increases. In the study shown, a classic by William
Roberts (1972), the pigeon’s performance is getting close to chance (50%)
when the delay between the sample and comparison is a mere 6 seconds.
The pigeon’s short-term memory seems very short indeed.
Working memory in this task is influenced by several factors. In the
experiment shown in Figure 8.21, the bird was required to peck the sample
stimulus either 1, 5, or 15 times before the sample went off and the delay
began. As the curves show, performance was better with the larger peck
requirement. Because the sample stayed on longer when the birds had
more pecks to make, this kind of result is usually interpreted to mean that
increasing exposure to the sample increases working memory. A second
factor that improves performance is practice. For example, Doug Grant
(1976) found that pigeons that were tested over thousands of trials with
several delays eventually had retention functions that stretched out to ap-
proximately 60 seconds (rather than only 6 seconds). Sargisson and White
(2001) found that training with the 6-second interval essentially eliminated
all forgetting over the intervals shown in Figure 8.21. One implication is
that at least some of the “forgetting” implied in results like Figure 8.21 is
caused by generalization decrement. That is, the birds fail to generalize
from training with no delay to tests with longer delays (see Zentall, 1997,
for an interesting discussion of this kind of possibility in several settings).
A third factor that influences memory in pigeon DMTS is our old friend
interference (see Chapter 5). Retroactive interference—in which information
after an event can hurt memory of the event—is thought to be created by
turning on an overhead light during the interval between the sample and
the comparison stimuli (e.g., Roberts & Grant, 1978). The brighter the light,
the more it hurts performance. Proactive interference, in which informa-
tion from before a trial hurts performance on the trial, seems to play an
especially important role in DMTS. It can be created by making the sample
key one color (say, green) for some seconds before the true sample color
(say, red) occurs. This change makes performance worse in the DMTS task
(e.g., Grant & Roberts, 1973). Memory on one trial can also be hurt by con-
flicting information from previous trials. For example, performance on a
trial in which the sample is red is worse if the sample on the previous trial
was green (e.g., Edhouse & White, 1988; Grant 1975; Roberts, 1980; White,
Parkinson, Brown, & Wixted, 2004). Notice that when the sample was dif-
ferent on the previous trial, the bird probably made a different response,
too; it turns out that both the incompatible sample and the incompatible re-
sponse create interference on the next trial. Interference appears to weaken
with longer intertrial intervals (e.g., Roberts, 1980), although explaining
this effect turns out to be complex (e.g., Edhouse & White, 1988). There
is no question, though, that interference has powerful effects on working
memory as studied in DMTS.
Another method used to study working memory involves rats rather
than pigeons. Figure 8.22 illustrates a radial maze developed by David
Olton (Olton & Samuelson, 1976). The maze is elevated above the floor,
and the different arms radiating from the center have no walls. As the rat

Figure 8.22  An eight-arm radial maze. (After Roitblat, 1987.)
moves around the maze, it can see different things in the room beyond the
maze (“extra-maze cues”). On each trial in a typical experiment, the rat is
allowed to explore the maze freely and retrieve all the bits of food (often
Froot Loops) that have been left at the end of each of the eight arms. Once
a bit of food is retrieved, that arm is left empty until the next trial. The rat
becomes very efficient at this task; for example, if the experimenter allows
the rat to leave the center part of the maze only 8 times, it will enter an
average of 7.5 arms without repeating itself (Olton, 1978). On a similar
maze with 17 arms, the rat enters more than 14 different arms on the first 17
choices (Olton, Collison, & Werz, 1977). Performance is well above chance.
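How far above chance is that? A rough way to see it (an illustration added here, not a calculation from the original reports) is to ask how many different arms a forager would enter if it simply chose among the arms at random, repeats and all:

```python
# Illustrative only: expected number of distinct arms entered when each
# choice is made uniformly at random among all arms, repeats allowed.
def expected_distinct(arms: int, choices: int) -> float:
    # P(a particular arm is never chosen) = ((arms - 1) / arms) ** choices
    return arms * (1 - ((arms - 1) / arms) ** choices)

print(expected_distinct(8, 8))    # about 5.3 distinct arms by chance vs. ~7.5 observed
print(expected_distinct(17, 17))  # about 10.9 distinct arms by chance vs. >14 observed
```

Random choice would yield only about 5 new arms in 8 choices on the eight-arm maze; the rats' scores are far better than that.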
It turns out that the rats are not systematically making their way through
the maze by visiting adjacent arms. And they do not merely avoid odors
that they leave behind on the visited arms (a possible sign saying “I’ve
been here already”). Instead, they remember places that they have visited
as defined by cues outside the maze. For example, Suzuki, Augerinos, and
Black (1980) surrounded the maze with a black curtain, which eliminated all
extra-maze cues. (Looking out from a maze surrounded by a black curtain
is a little like looking out into deep space.) They then pinned odd items
(things like a toy bird, a yellow card, Christmas tree lights, etc.) to the black
curtain near the end of each arm. After the rat had learned to perform on
the maze, on a test trial it made a few choices and was then confined to
the center of the maze. At this time, the experimenters rotated the curtain
180 degrees (Figure 8.23). When released, the rat likewise rotated its next
choices so that it approached the items that it had not approached before.
This meant that the rat walked down many arms it had recently traveled.
The rats were thus remembering the places as defined by extra-maze cues
rather than individual arms themselves or odors they had left on the arms.
And, interestingly, if the items were interchanged or “transposed” rather

Figure 8.23  (A) Extra-maze cues positioned at the end of each arm of a radial
maze could be rotated or transposed. (B) When the cues were rotated after the
rat made some initial choices, the next choices rotated as well. (C) In
contrast, when the cues were transposed, the next choices dropped to chance.
(After Suzuki et al., 1980.)
than merely rotated (see Figure 8.23), the rat’s choices were severely dis-
rupted. This result suggests that the rats had been using a configuration
of several landmarks to define each location in the maze.
To perform accurately on a given trial, the rat must be remembering lo-
cations in working memory as it makes choices on the maze. With the right
training, the rat can remember its first four choices for at least 4 hours, after
which its working memory (as measured by its next four choices) begins
to decline (Beatty & Shavalia, 1980). The rat also needs to use long-term
or reference memory to perform. That is, the rat also needs to remember
that food is present on the arms at the start of a trial. Working memory and
reference memory can both be studied (and distinguished) on the radial
maze. For example, Olton and Papas (1979) always baited some of the
arms but never baited several others. The rats learned to avoid the arms
that were never baited and work their way (without repeating themselves)
through the arms that were baited. Surgical damage to the neurons coming
out of the hippocampus, a part of the brain known to influence learning
and memory, hurt the ability of the rats to visit the baited arms without
repeating themselves, but it did not affect their ability to avoid the never-
baited arms. The results thus suggest that the hippocampus plays a role
in working memory rather than reference memory.
We can be even more specific about how working memory operates in
this task. If you think about it, the rat could be using its memory in two
ways. First, as it makes its choices throughout a trial, the rat might simply
remember the places that it has recently visited—a retrospective code of
places from the past. Memory would be especially taxed on later choices,
when there are more visited locations to remember. On the other hand, the
rat might remember places on the maze that it has yet to visit, a so-called
prospective code. (Prospective coding looks forward rather than back-
ward in time.) In this case, the memory load would be highest on the early
choices, not the later choices, because that is when the number of sites yet
to be visited is high. Robert Cook, Michael Brown, and Donald Riley (1985)
found evidence that the rat uses both types of codes. They first trained rats
on a 12-arm version of the maze. Then, on different test trials, they removed
the rats from the maze for 15 minutes following the second, fourth, sixth,
eighth, or tenth choice on the trial. The rats were then returned to make the
remaining choices. If the rat was solely using a retrospective code, inserting
the retention interval after the tenth choice should have been especially
disruptive—that was when the rat was holding a lot of information in
memory. But after the second choice, things were relatively easy, and a
retention interval should not have as much effect. In contrast, if the rat was
solely using a prospective code, a retention interval after the second choice
would have been most difficult to handle, and one after the tenth choice
should have been easy. Amazingly, the results (Figure 8.24A) suggested
that there was little disruption when the interval was inserted after either
early or late choices; the biggest disruption occurred in the middle—after
the sixth choice (see also Kesner & DeSpain, 1988)! Apparently, the first few
Figure 8.24  Retrospective and prospective coding in working memory. (A) Error
scores of rats in a 12-arm radial maze when a retention interval was inserted
after choices 2, 4, 6, 8, or 10. Rats behaved as if they remembered the places
that they had been for the first several choices, and then they switched to
remembering the places that they still needed to visit. (B) Similar results
with pigeons required to peck five different keys without repeat; delays were
inserted after choices 1, 2, 3, or 4. (A, after Cook et al., 1985; B, after
Zentall et al., 1990.)

choices were guided by remembering places past, and later choices were
guided by remembering places yet to go. This flexibility is remarkable. But
it might also be efficient. By switching from retrospective to prospective
coding in the middle of the trial, the rat only needed to remember 5 or 6
locations at one time instead of 11 or 12.
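The efficiency argument can be made concrete with a little counting. The sketch below is an added illustration (not part of the Cook et al. study): it tallies how many locations must be held in working memory after each choice on a 12-arm maze under a purely retrospective code, a purely prospective code, and a code that switches halfway through the trial.

```python
# Illustrative only: memory load (locations that must be held in working
# memory) after each choice on a 12-arm radial maze.
ARMS = 12

def retrospective_load(choice: int) -> int:
    return choice                      # remember every arm already visited

def prospective_load(choice: int) -> int:
    return ARMS - choice               # remember every arm still to be visited

def switching_load(choice: int) -> int:
    return min(choice, ARMS - choice)  # use whichever code is currently cheaper

for c in range(1, ARMS):
    print(c, retrospective_load(c), prospective_load(c), switching_load(c))
# A pure code can demand up to 11 locations at once; switching caps the load at 6.
```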
Pigeons may likewise use either retrospective or prospective coding in
the DMTS task. That is, during the delay between sample and comparison
stimuli, they might remember either the sample (a retrospective code) or
the comparison stimulus that they need to peck (a prospective code). The
distinction is especially clear in “symbolic” delayed matching-to-sample,
where a red sample (for instance) might signal that pecking a horizontal
line will be reinforced and a green sample might signal that pecking a verti-
cal line will be reinforced. Is the bird remembering the red sample or green
sample during the delay, or is it remembering to peck the upcoming verti-
cal or horizontal line? There is evidence that pigeons can use either type
of code (e.g., Roitblat, 1980; Urcuioli & Zentall, 1986). And as we just saw
for rats in the radial maze, the pigeons switch between the two types of
codes with impressive flexibility. Zentall, Steirn, and Jackson-Smith (1990)
arranged an experiment with pigeons that was similar to the rat experiment
by Cook et al. (1985) (Figure 8.24B). At the start of a trial, five different keys
were illuminated. The pigeon’s job was to peck at each of the keys, with-
out a repeat, to receive reinforcement for every response. Once each correct
key was pecked, reinforcement was delivered, and all five keys were then
turned off for 2.5 seconds before they came back on again. After the birds
had learned the task, the experimenters inserted longer delays between the
different choices (first, second, third, etc.) on different trials. Once again, as
we saw in the radial maze, delays after the early and later choices were not
as disruptive as delays after the middle choices. The birds were apparently
using a retrospective code for the first few choices and then a prospective
code for the final choices. Studies of working memory thus suggest that
animals can learn to use their memories actively and efficiently.
Reference memory
We have already discussed long-term (or reference) memory earlier in this
book (especially Chapters 4 and 5). To briefly review, this sort of memory is
more permanent than working memory, and it has a much larger capacity.
Information from long-term memory is thought to be activated by retrieval
cues. Forgetting is generally held to be a result of interference (either pro-
active or retroactive) or retrieval failure caused by a mismatch between
contextual cues in the background at the time of learning and retrieval.
Researchers studying human long-term memory have often distinguished
between several types. For example, Squire (1987) has distinguished between
procedural memory and declarative memory. Procedural memory is one’s
memory for behavioral procedures, or how to do things (e.g., ride a bike,
swim, or play a guitar). Declarative memory, in contrast, is essentially ev-
erything else. For instance, semantic memory is one’s memory for things
like words and facts (e.g., Tulving, 1972); it is the kind of information that is
remembered when you have a good night playing Trivial Pursuit. Episodic
memory, on the other hand, is memory for specific personal experiences,
such as what happened when you visited your aunt and uncle yesterday or
when you last traveled to Washington, D.C. This type of memory involves
information about what, where, and when an event happened. The distinc-
tions between these forms of long-term memory have been tricky to study
in animals because animals cannot verbalize, and it is often not clear how
to conceptualize the reference memory involved in many laboratory tasks.
For example, is memory in a Pavlovian conditioning experiment procedural
(knowing what to do when the CS comes on) or declarative (knowing what
the CS means)? Because conditioning may involve both S-R associations
(procedural memory?) and S-S associations (declarative memory?), there is
probably an element of both involved.
It would seem especially hard to find evidence of episodic memory
in animals. Nicky Clayton and Anthony Dickinson nonetheless provided
interesting evidence in some wonderful experiments with scrub jays. These
birds naturally hide (or cache) bits of food in their environment and then
recover the food bits later (see Clayton, Griffiths, & Dickinson, 2000, for one
review). Clayton and Dickinson’s experiments took advantage of captive
birds’ tendency to take food items from a bowl and cache them in nearby
ice-cube trays filled with sand. (The birds recover the items later.) Consider
the following experimental arrangement (Clayton & Dickinson, 1999). In
an initial training phase, the birds were given a series of trials designed to
teach them something about peanuts and mealworms. On each trial, the
birds were given a bowl of either peanuts or mealworms and were allowed
to cache them in a unique ice-cube tray that was decorated with unique
plastic Legos. They could then recover the cached items either 4 hours or
124 hours later. At the 4-hour interval, the peanuts and mealworms were
always fine. But at the 124-hour interval, two groups of birds had differ-
ent experiences. In the control group, the peanuts and mealworms were
once again intact and fine. But in the “Spoiled” group, the experimenters
had replaced the cached mealworms with mealworms that had been dried
and then soaked in disgusting dishwashing detergent. The worms had
essentially gone bad (or, spoiled) over the 124-hour period. The training
procedure thus allowed the birds in the Spoiled group to learn that meal-
worms, but not peanuts, are bad if they are recovered after several days.
In subsequent test trials, the birds first cached the peanuts and meal-
worms as usual, and during recovery tests 4 and 124 hours later, the birds
were given the trays in which they had cached the food items. On these tri-
als, however, the food had been removed from the trays and the sand had
been replaced, so the birds’ searching behavior could only be guided by their
memory of what they had cached and where. The idea of the experiment
and the results are summarized in Figures 8.25 and 8.26. In the control
group, the birds searched more for mealworms than peanuts at both the
long- and short-retention intervals. They preferred the mealworms and evi-
dently remembered what they had cached and where. The Spoiled group did
essentially the same at the 4-hour interval, but did something quite remark-
able at the 124-hour interval. At this time, their preference switched almost
exclusively to peanut sites over mealworm sites. The birds had learned that
mealworms are bad if caching occurred a long time ago, and they were using
this information to avoid the “spoiled” mealworms. To perform this way, the
birds must have remembered what they had cached, where they had cached
it, and when they had cached it, too. Clayton and Dickinson have therefore ar-
gued that this and related results indicate episodic-like memory in this bird.
According to subsequent researchers, scrub jays are not the only species
to have episodic-like memory; there have now been reports of it in several

Figure 8.25  Training procedure and predictions of what the “Spoiled” group
should do in the experiment testing episodic-like memory in scrub jays
(Clayton & Dickinson, 1999). During training, mealworms were “spoiled” when
the birds recovered them 124 hours—but not 4 hours—after caching (storing)
them.
Figure 8.26  Number of searches by scrub jays at peanut or mealworm sites when
testing occurred 4 hours or 124 hours after caching. (A) The control group
always preferred to search for mealworms over peanuts. (B) The same was true
for the “Spoiled” group when tested 4 hours after caching; after 124 hours,
however, these birds searched for peanuts instead of mealworms—they had
previously learned that mealworms spoil after this interval of time (see
Figure 8.25). To perform this way, the Spoiled group must have remembered what
they had cached as well as where and when they had cached it. (After Clayton &
Dickinson, 1999.)

other animals, including dolphins, primates, and other birds including
chickadees, magpies, and pigeons (for reviews, see Allen & Fortin, 2013;
Zentall, 2013). Jonathon Crystal and his colleagues have reported it in rats
(e.g., Babb & Crystal, 2006; Crystal & Smith, 2014; Zhou & Crystal, 2009).
Their experiments used the eight-arm radial maze (see Figure 8.22) in clever
new ways. For example, Zhou and Crystal (2009) put rats on the maze twice
every day, either in the morning or in the afternoon. On the first daily ex-
posure, a sample trial, the rats could explore a few randomly chosen arms
(the other arms were blocked). The rats found bits of chow on these arms,
except on a randomly chosen one, which contained a tasty bit of chocolate.
(Rats prefer chocolate to chow.) On the second trial a few minutes later,
which was a test trial, the rats could explore all eight arms. The new arms
all contained chow. But more important, if the trials occurred at 7:00 a.m.,
the arm that previously contained chocolate contained chocolate again; if
the trials occurred at 1:00 p.m., though, the chocolate arm contained noth-
ing. (Half the rats actually received the reverse arrangement.) With daily
training, the rats were more likely to return to the chocolate arm during
the morning test than in the afternoon. They also chose the chocolate arm
when the morning test was delayed by 7 hours. Here the rats returned
to an arm that had contained chocolate at 7:00 even though the test now
occurred at 2:00 in the afternoon! To perform this way, Zhou and Crystal
argued that the rats must have remembered the chocolate (what), the cor-
responding arm of the maze (where), and whether it had been sampled dur-
ing the morning or afternoon (when). Other studies suggest that animals
can respond with this kind of what-where-when knowledge even when
a test trial is new and unexpected (e.g., Zhou, Hohmann, & Crystal, 2012;
see also, e.g., Singer & Zentall, 2007). Crystal and others have therefore
argued that rats and other species may indeed have episodic-like memory.

The Cognition of Time


One remarkable aspect of behavior is that it is very sensitive to time. For ex-
ample, if you are sitting in your car waiting for a red light to change but the
timer gets stuck, you will eventually realize that something is wrong. You also
become impatient when a waiter is slow to serve you dinner at a restaurant or
when a website is unusually slow to load. And when you buy a new and faster
computer, you might feel like familiar programs are running very quickly.
Time, and the perception of time, is often in the background guiding actions
and behavior throughout the day. Accordingly, there has been a great deal of
interest in how organisms learn about and organize their behavior in time.
Time of day cues
Animals have an internal source of temporal information that corresponds
to the time of day. For example, Robert Bolles and Sam Moot (1973) isolated
rats in a room with constant dim light and fed them
every day at 10:00 a.m. and 4:00 p.m. The rats had
access to a running wheel throughout the day. They used their sense of time
of day to anticipate each of the meals—that is, they became active in the
wheel an hour or two before each meal. Similarly, birds—including pigeons
(e.g., Carr & Wilkie, 1998; Saksida & Wilkie, 1994) and garden warblers
(e.g., Biebach, Gordijn, & Krebs, 1989)—have been trained to peck different
keys or visit different locations at different times of the day to obtain
food (Figure 8.27). All the results suggest that animals can use time of day
cues as CSs or occasion setters to predict or locate food.

Figure 8.27  Pigeons can use time of day cues to set the occasion for which
of two keys will be reinforced. Before the tests shown, pecks to Key 1 had
been reinforced during morning sessions, and pecks to Key 2 had been
reinforced during afternoon sessions. (After Saksida & Wilkie, 1994.)

Most animals, including humans, have a so-called circadian rhythm that
responds to the 24-hour period. These rhythms are easily observed in rodents
with continuous access to running wheels. Rodents become active at night, and
this tendency persists even when experimenters keep the animals under
constant light, thus eliminating the usual day and night cues. The animals
appear to respond according to an internal clock that has a period of
approximately 1 day (circadian roughly means “about a day”). Actually, under
these “free-running” conditions, the rodent’s activity tends to begin a
little later each day, as if the average clock has a period that
is a bit longer than 1 day. Under normal conditions, the animal’s circadian
clock is “entrained,” or brought into phase with the actual day, by stimuli
like natural light. These entraining stimuli are called zeitgebers (“timegivers”
in German). You experience your own circadian clock when you become
jet-lagged after flying to a distant time zone. Your circadian clock is out
of phase with the new conditions, although exposure to local zeitgebers
gradually entrains and shifts your clock to local time. The fact that an ani-
mal’s circadian clock can be shifted gradually is often used to study its role
in timing behavior. For example, in the research mentioned above (e.g., see
Figure 8.27), Saksida and Wilkie (1994) trained pigeons to peck different
keys in a Skinner box at two different times (about 9:00 a.m. and 3:30 p.m.)
each day. Then they shifted the lighting schedule in the birds’ home room
so that the artificial day began at midnight instead of 6:00 a.m. This shift did
not change the birds’ performance on the first day; thus, the birds were not
merely timing the interval between lights on and the test to set the occasion
for the correct key peck. When the birds were tested again 6 days later, how-
ever, after enough time for the clock to be entrained to the new photoperiod,
performance changed. The results suggested that the birds had learned to
use the circadian clock to peck the correct key.
As mentioned above, rats appear to use the circadian clock to anticipate
a daily meal when it is scheduled at the same time each day. Interestingly,
other intervals are more difficult to use. Bolles and de Lorge (1962) and Bolles
and Stokes (1965) found that rats anticipated meals spaced 24 hours apart,
but never learned to anticipate meals spaced 19 or 29 hours apart. Notice
that meals spaced 19 or 29 hours apart would be delivered at a different
time each day. One interpretation is that “preparedness” once again operates
here (see Chapter 2): The rats use a 24-hour circadian signal as a CS for food,
but not a cue that is incompatible with their natural clock (e.g., Boakes, 1997).
Interval timing
Animals are also good at timing intervals in the range of seconds and min-
utes. We have already seen some examples of this. In Chapter 3, we consid-
ered inhibition of delay, where responding in a CS starts slowly and then
peaks toward the end of the CS, when the US is usually delivered. The animal
times the duration of the signal and performs its behavior appropriately. In
Chapter 7, we also saw that the rate of operant behavior on a fixed-interval
schedule of reinforcement “scallops”—or anticipates—the point at which the
next reinforcer will be delivered. Unlike the circadian clock, which runs on
its own once it is entrained, these demonstrations suggest a flexible timing
process that can be turned on and off, almost like a stopwatch. Because of
this kind of observation, investigators have become extremely interested in
how animals perceive and use information about temporal intervals.
To study interval timing, experimenters have developed several meth-
ods that are clever extensions of the simple stimulus control experiment.
For example, in a temporal generalization procedure, an animal in a Skin-
ner box is reinforced if it responds after presentation of a signal of one dura-
tion, but not after signals of other durations (e.g., Church & Gibbon, 1982).
When the subject is exposed to signals of various durations, there is a nice
generalization gradient in which the highest responding is observed after
the reinforced duration. Another important method is the peak procedure
(e.g., Catania, 1970; S. Roberts, 1981). In this method, animals receive many
trials in which a signal is presented and the first response after some fixed
interval (say, 20 seconds) is reinforced. As you might expect, the animal
begins to delay its responding until the reinforcer is expected in time. To
study the accuracy of the animal’s timing, the experimenter occasionally
introduces blank or “empty” trials in which the signal stays on for a much
longer period and the animal receives no reinforcement. The animal shows
a “peak” of responding at the point in time when the reinforcer was oth-
erwise expected, and then responding declines. The top parts of Figure
8.28 illustrate responding on empty trials for different groups of rats that
had been reinforced 20 seconds or 40 seconds into the signal (Roberts,
1981). Clearly, the animals have learned to expect reinforcers at roughly
20 seconds or 40 seconds into the signal.

Figure 8.28  Results from the peak procedure. Top: Rats were reinforced for
the first response 20 seconds or 40 seconds after the start of a signal. In
the “empty” trials shown, the reinforcer was not delivered, and the signal
stayed on longer. Notice that the response rate peaked around the time when
responding had been reinforced, but that timing was more accurate with the
shorter 20-second interval (there was less spread around the peak). Bottom:
The same data with the x-axis now showing time as the proportion of the timed
interval. Notice that the two curves almost lie on top of each other—they
“superpose.” (Data from Roberts, 1981; after Shettleworth, 1998.)
It is interesting that the shapes of the response curves are similar for the
20-second and 40-second intervals. Timing is reasonably accurate in both
cases, but not perfect. Notice, though, that the spread of responding around
the peak at the 40-second interval is quite a bit broader than the spread
around the 20-second interval. This difference illustrates an important
fact about timing: Longer intervals are timed less accurately than shorter
intervals. One of the fascinating facts about timing is that the spread of
responding around the peak is lawfully related to the interval that is being
timed. This property of timing is illustrated by the lower graph in Figure
8.28. Here, response rate is plotted as a function of the proportion of the
timed interval, rather than absolute time, on the x-axis. Thus, for animals
that were timing the 20-second interval, “1.0” is 20 seconds, whereas “0.5”
is 10 seconds and “2.0” is 40 seconds. For the 40-second interval, in contrast,
1.0 is 40 seconds, whereas 0.5 is 20 seconds and 2.0 is 80 seconds. Plotted
this way, the two response curves look alike; they are said to
superpose. (The result is called superposition.) Regardless of the interval
that is being timed, on a given task the probability of responding is related
to the proportion of the time into the interval. This property of timing is
sometimes called the scalar property (e.g., Gibbon & Church, 1984). It is an
example of Weber’s law, a law in psychophysics that holds that perceived
differences are a constant proportion of the value being judged.
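Stated a little more formally (a summary added here, which simply restates the description above), the scalar property says that the spread of the temporal estimate grows in proportion to the interval being timed, so that responding depends only on relative time:

```latex
\sigma_T \propto T
\qquad\text{(equivalently } \sigma_T / T = \text{constant)},
\qquad R(t) = f\!\left(t/T\right)
```

where T is the reinforced interval, sigma_T is the spread of responding around it, and R(t) is the response rate t seconds into the signal. Plotting response rate against t/T is what makes the 20-second and 40-second curves in Figure 8.28 superpose.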
The scalar property of interval timing pops up wherever investigators
have looked for it. It showed up in the temporal generalization procedure
mentioned earlier (Church & Gibbon, 1982). It also shows up in inhibition of
delay (Figure 8.29A; Rosas & Alonso, 1996) and in behavior on FI schedules
of reinforcement (Figure 8.29B; Dews, 1970). Regardless of the method used
to investigate timing, longer intervals are timed less accurately than shorter
intervals, and the amount of error is proportional to the length of the timed
interval. One way to think of it is that, on a given trial, the animal is compar-
ing elapsed time with its memory for the interval that has previously been
reinforced. If its decision to respond is based on the ratio of the current time
and the memory for time (which is the current proportion of the reinforced
interval shown in Figures 8.28 and 8.29), everything falls into place, and you
will get the kind of pattern shown in Figures 8.28 and 8.29.
The idea that animals compare the ratio of the current and remembered
time is also supported by research using a temporal bisection procedure.
Church and Deluty (1977) presented rats in a Skinner box with a cue that was
either 2 seconds or 8 seconds long. Immediately after the cue was presented,
two levers were inserted into the chamber. If the cue had been 2 seconds long,
pressing the left lever was reinforced; if the cue had been 8 seconds long,
pressing the right lever was reinforced. The rats learned this discrimination.
What was interesting, though, were the results of test trials in which the
experimenters presented cues that were between 2 seconds and 8 seconds in
duration. Generally speaking, responding changed gradually as a function
of the duration of the test signal. The test signals that were shorter than 4
seconds long were judged to be short—the rat tended to press the left lever
Figure 8.29  Other examples of superposition in interval timing. (A) Inhibition
of delay in conditioned suppression in rats. Rats received an electric shock US
at the end of a tone CS that was 50, 100, 150, or 200 seconds long. Suppression
increased during later parts of the CS—as a function of relative time in the
CS. (B) Responding of pigeons on different fixed interval (FI) schedules of
reinforcement. Different groups received FI 30-second, FI 300-second, and FI
3,000-second reinforcement schedules. Remember that in such schedules, the
first response after a given interval since the last reinforcer is reinforced.
FI schedules cause “scallops” in responding—the increase in response rate over
the fixed interval shown here. Once again, response rate is a consistent
function of relative time in the interval. (A, after Rosas & Alonso, 1996; B,
data from Dews, 1970, after Gibbon, 1991.)

after each signal—and cues that lasted more than 4 seconds were judged to
be long (the rat tended to press the right lever after each signal). The 4-second
cue, however, was the psychological middle; the rats judged it to be long or
short (pressing left or right) with about equal probability. One way to think
of it is that there was equal generalization to 4 seconds from the 2-second
(“press left”) and 8-second (“press right”) cues. For that to occur, the animal
was once again comparing the ratio of time in the test to remembered time
(4:2 is the same as 8:4). Essentially, the same results were obtained when the
experiment was repeated with 4- and 16-second cues (the middle point being
8 seconds), 1- and 4-second cues (the middle point being 2 seconds), and 3-
and 12-second cues (the middle point being 6 seconds). The animal always
behaved as if it were comparing ratios. Similar results have been obtained
with humans (Allan & Gibbon, 1991). In timing tasks, organisms seem to
respond to the ratio of current time and remembered time.
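A convenient way to summarize these bisection results (an added note that follows directly from the ratio logic just described) is that the indifference point falls at the geometric mean of the two trained durations rather than at their arithmetic mean:

```latex
t_{1/2} = \sqrt{t_{\mathrm{short}}\, t_{\mathrm{long}}}:\qquad
\sqrt{2 \times 8} = 4,\quad \sqrt{4 \times 16} = 8,\quad
\sqrt{1 \times 4} = 2,\quad \sqrt{3 \times 12} = 6 \text{ (seconds)}.
```

At the geometric mean, the test duration stands in the same ratio to the short anchor as the long anchor does to the test (4:2 equals 8:4), so an animal comparing ratios should be indifferent exactly there.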
How do they do it?


Theorists have often assumed that humans and other animals use a kind of
internal clock to estimate time. The most influential approach to timing is
the information processing approach developed by John Gibbon and Rus-
sell Church (e.g., Gibbon, Church, & Meck, 1984). They proposed a system
that fits nicely with the standard cognitive model discussed throughout this
book. The model is illustrated in Figure 8.30. It supposes that timing in-
volves a clock, working and reference memory, and a decision process. The
clock has several interesting parts. First, there is a “pacemaker”—a hypo-
thetical device that emits pulses at a lawful rate. When an event occurs that
starts the clock (like the onset of a to-be-timed signal), a “switch” is closed
that allows pulses from the pacemaker to collect in an “accumulator”—the
animal’s working memory for elapsed time. Elapsed time is represented
as the number of pulses that have accumulated, or n. When the animal is
reinforced after some duration, the system stores an approximation of the
current value of n in reference memory; the reinforced duration in reference
memory—or, essentially the expected time to a reinforcer—is n*. In any
timing task, the animal continuously compares its working memory for
the amount of time that has elapsed in the current trial (n, in the accumula-
tor) with a reference memory for the expected time to the reinforcer (n*).
In the model, the comparison is made by a “comparator” that generates a
response when the difference between n and n* gets smaller than a certain
value (b; see Figure 8.30). (Notice that the difference between n and n* be-
comes smaller as n approaches n*.) Error in the accuracy of timing can be

Figure 8.30  An information processing model of timing (e.g., Gibbon et al.,
1984) has clock, memory, and decision components. A signal closes a switch
that lets pulses from the pacemaker collect in the accumulator (working
memory, n); reference memory stores the reinforced value (n*); and the
comparator generates a response when (n* – n)/n* falls below the threshold b.
introduced in a number of ways, most notably because the pacemaker varies
in rate slightly from trial to trial (Gibbon & Church, 1984). But because
the crucial difference between time in working and reference memory is
expressed as a proportion of the expected interval to reinforcement (n*; see
Figure 8.30), the scalar property is obtained.
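The flow of Figure 8.30 can be expressed as a small simulation. The sketch below is only an added illustration, not Gibbon and Church's own implementation; the pulse rate, its trial-to-trial variability, and the threshold value b are arbitrary choices made here.

```python
import random

def simulate_trial(reinforced_time, mean_rate=5.0, rate_sd=0.5, b=0.2):
    """Crude pacemaker-accumulator sketch (illustrative only).

    The pacemaker's rate is drawn fresh on each trial, one way the model
    introduces variability; returns the moment the comparator says 'respond'.
    """
    rate = random.gauss(mean_rate, rate_sd)  # pulses per second on this trial
    n_star = mean_rate * reinforced_time     # reference memory: expected pulse count
    t, n = 0.0, 0.0
    while (n_star - n) / n_star >= b:        # comparator: respond when close enough
        t += 0.1                             # clock ticks forward
        n = rate * t                         # accumulator: pulses collected so far
    return t

# The spread of start-of-responding times grows with the interval being timed,
# roughly in proportion to it (the scalar property).
for target in (20, 40):
    starts = [simulate_trial(target) for _ in range(1000)]
    print(target, round(sum(starts) / len(starts), 1))
```

Because the comparator works on the proportional difference (n* − n)/n*, variability in the simulated start of responding scales with the interval, which is how the model produces superposition.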
The information processing model of timing has had a large influence
on timing research. It is intuitively plausible, and it seems to fit how psy-
chologists think perception and memory relate to one another. In addition,
investigators have used a number of experimental manipulations to separate
the clock and memory systems in a way that is consistent with Figure 8.30
(e.g., Church, 1989; Meck, 1996; Roberts, 1981). For example, giving rats
methamphetamine (“speed”) seems to speed up the pacemaker (e.g., Meck,
1983). This idea is suggested by several intriguing results. If rats have already
been trained on a timing task, introducing the drug initially makes the rats
judge timed intervals as longer than they really are; it is as if the pacemaker
is revved up and pulsing quickly. But if extensive training is then conducted
while the animal is drugged, the system stores an abnormally large number
of pulses (n) as the expected interval to reinforcement (n*). Timing becomes
accurate under the influence of the drug. If the rat is now tested without the
drug, it behaves as if its clock is running slow—pulses now accumulate at
the normal rate, but it takes more of them to accumulate before the animal
reaches the remembered interval! These results are consistent with the clock
and memory functions built into the information processing model.
Russell Church and Hilary Broadbent later suggested another way to
conceptualize the internal clock (Church & Broadbent, 1990). They got
rid of the pacemaker and accumulator and instead suggested that the
clock might consist of a group of “oscillators”—that is, units that switch
between two different states at a regular rate over time (see Gallistel,
1990). The idea behind their multiple oscillator model is sketched in
Figure 8.31. Each of the curves represents the status of an oscillator that
Oscillator state

Figure 8.31  How multiple


oscillators can tell time. Each of
several oscillators cycles through
values of +1 and –1 at a different
rate. Any given point in time can
be represented by reading the
T1 T2 T3 current status of all the oscilla-
Time tors. (After Shettleworth, 1998.)

Au: Vertical dashed lines were added to T1 and T2 to match T3. OK as drawn?
342  Chapter 8

varies between values of +1 and –1, with a fixed period. Each oscillator
changes at a different rate. Any point in time can be represented by listing
the status of each of the different oscillators. For example, at Time 3, the
oscillators from top to bottom have values of – – + – +. Notice that their
values are different at Time 1 and Time 2. Each readout can thus label a
different point in time. In this model, the start of a trial starts the oscil-
lators going; the current readout of the oscillators is what is in working
memory, and a remembered readout is in reference memory. A multiple
oscillator model can do a good job of simulating the findings
of the timing literature (e.g., see Church & Broadbent, 1990; Wearden &
Doherty, 1995). It also has a number of advantages, one of which is that
it seems more plausible from a biological perspective. Oscillators like the
ones supposed here are thought to be in control of repetitive activities like
licking, breathing, and circadian rhythms. In contrast, it is hard to find
a biological event that corresponds to a switch or an accumulator. More
important, unlike the pacemaker–accumulator, a clock made up of oscil-
lators is likely to time certain intervals with greater precision than other
intervals. This is because the readout from the oscillators does not change
in a perfectly linear way over time—sometimes a small change in time
will correlate with a relatively big change in readout from the oscillators
and vice versa. Consistent with this possibility, Crystal and others have
found that rats can time certain intervals especially accurately (Crystal,
1999, 2003, 2012; Crystal, Church, & Broadbent, 1997).
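To make the oscillator idea concrete, the following sketch (an added illustration with arbitrary example periods, not Church and Broadbent's simulation) reads out the +1/–1 state of several square-wave oscillators and shows that different points in time yield different overall patterns.

```python
def oscillator_state(t, period):
    """+1 during the first half of each cycle, -1 during the second half."""
    return 1 if (t % period) < (period / 2) else -1

PERIODS = [2, 5, 11, 23, 47]  # arbitrary example periods, in seconds

def readout(t):
    """The pattern across all oscillators acts as the 'time stamp' for t."""
    return tuple(oscillator_state(t, p) for p in PERIODS)

# Working memory would hold the current readout; reference memory would hold
# the readout that was stored at the moment of reinforcement.
for t in (3, 10, 20, 40):
    print(t, readout(t))
```

Because the pattern does not change linearly with elapsed time, some intervals map onto more distinctive readouts than others, which is one reason such a clock could time certain intervals with special precision.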
It is also possible to explain timing behavior without appealing to
an internal clock at all. Staddon and Higa (1999) have proposed that the
start of the trial is recorded in short-term memory and that the strength
of this memory then fades systematically over time. Animals can learn
to time a reinforcer by merely associating the reinforcer with the current
strength of the memory from the start of the trial. (For reasons that there
is not space to explain, Staddon and Higa’s model is called the multiple-
time-scale model.) Alternatively, in the behavioral theory of timing
(Killeen & Fetterman, 1988; see also Machado, 1997), animals are assumed
to go through a fixed sequence of behaviors during any to-be-timed in-
terval. Animals can time the reinforcer by learning what behavior they
were doing when reinforcement was delivered. Neither of these views
supposes an internal clock, but both can account for many features of
timing, including the scalar property illustrated in Figures 8.28 and 8.29.
Staddon and Higa explain it by assuming that the memory of the start of
the trial fades according to a logarithmic function that causes its strength
to decrease rapidly at first and then slower and slower over time. Thus,
short intervals can be timed more accurately than long intervals. Killeen
and Fetterman (1988) assume that animals go through the sequence of
different behaviors more rapidly when shorter intervals are being timed.
It may be unnecessary to link timing so rigidly to overt behaviors, though;
attempts to find the sequence required have not always been successful
(Lejeune, Cornet, Ferreira, & Wearden, 1998). The behavioral theory of
timing has also been challenged by other findings (e.g., see Bizo & White,
1995; Leak & Gibbon, 1995).
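One way to see why a logarithmically fading trace yields Weber-like timing (an added illustration, not Staddon and Higa's actual equations) is to note that equal steps in trace strength correspond to time steps that grow with elapsed time:

```latex
m(t) \approx m_0 - k \ln t
\quad\Rightarrow\quad
\Delta m \approx k\,\frac{\Delta t}{t}
\quad\Rightarrow\quad
\Delta t \approx \frac{t}{k}\,\Delta m \;\propto\; t.
```

If the trace can only be read with some fixed resolution in strength, the corresponding error in time grows in proportion to the interval, which is the scalar property once more.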
To summarize, research on interval timing illustrates that animals
readily use the passage of time as a cue to predict the presentation of
reinforcers, and experiments from many methods have produced compat-
ible and lawful results. The results clearly suggest that animals can use
some cognitive or behavioral variable that correlates with the passage of
time to make judgments about it. There has been disagreement, though,
about what the temporal variable is. A common assumption has been
that animals represent time with an internal clock, although this idea has
not been accepted by everyone. Although the pacemaker–accumulator
mechanism used in scalar expectancy theory has been highly influential,
a number of important questions have been raised about it (e.g., Crystal,
2003; Staddon & Higa, 1999), and other interesting ways to represent time
are possible. A great deal of recent thinking about interval timing has
been further influenced by investigations of the brain mechanisms that
underlie it (e.g., Allman, Teki, Griffiths, & Meck, 2014; Buhusi & Meck,
2005; Matell & Meck, 2004).

The Cognition of Space


Cues that guide spatial behavior
Most animals (including humans) need to find their way through space
to reach goals like food, mates, shelter, and so forth. Thus, instrumen-
tal behavior occurs in the context of space as well as time. How animals
use spatial information has interested psychologists at least since Tolman
first proposed the role of “cognitive maps” (see Chapter 7). Although we
already know from Tolman’s experiments that rats are good place learn-
ers, whether they represent their environment as a kind of global map in
memory is still controversial. The idea has its proponents (see pp. 344–346),
but research has often pursued the more modest goal of figuring out what
kinds of cues are used when we get around in space—and how we go
about using those cues.
Sometimes navigation can occur without immediate reference to any
cues in the environment at all. In dead reckoning, the animal uses an
internal sense of direction and distance to get from one location to a goal
in a straight line. For example, golden hamsters can be led in the dark to
a location away from the nest by moving them a few feet in one direction,
turning, and then moving in another. They return directly to the starting
point by somehow integrating information about the outward paths (e.g.,
Etienne, 1992; Etienne, Berlie, Georgakopolous, & Maurer, 1998). Other-
wise, animals do use available cues to navigate to a goal. Sometimes they
can use beacons—that is, cues that are very close to the goal that the animal
can see at a distance and simply approach. You are using a beacon when
you reach a gas station or the exit of a building by spotting and approaching
a sign. Learning about beacons is presumably another example of stimulus
learning—the animal learns that the stimulus is associated with the goal
and then approaches it, as in sign tracking (see Chapter 2).
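Dead reckoning, described just above, amounts to keeping a running sum of the displacements of the outward trip and then reversing it. The short sketch below is an added geometric illustration (it is not taken from the hamster experiments); it integrates a series of (heading, distance) steps and returns the straight-line bearing and distance back to the start.

```python
import math

def home_vector(steps):
    """Integrate outward (heading_in_degrees, distance) steps and return the
    heading and distance of the straight path back to the starting point."""
    x = sum(d * math.cos(math.radians(h)) for h, d in steps)
    y = sum(d * math.sin(math.radians(h)) for h, d in steps)
    heading_home = math.degrees(math.atan2(-y, -x)) % 360
    return heading_home, math.hypot(x, y)

# Example: carried 3 m east (0 degrees) and then 4 m north (90 degrees);
# the direct route home is 5 m toward roughly 233 degrees (southwest).
print(home_vector([(0, 3), (90, 4)]))
```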
Often, though, beacons are not positioned so conveniently next to the
goal, and the animal must therefore use landmarks—that is, cues that
have a fixed relationship with the goal but are not very close to it. The
extra-maze cues that a rat uses to get around a radial maze are one ex-
ample. You will remember that the rat uses configurations of these cues,
rather than just a single cue, to identify locations on the maze (e.g., Suzuki
et al., 1980). Other research has built on this point. For example, Mar-
cia Spetch, Ken Cheng, Suzanne MacDonald, and others (1997) trained
pigeons to find food that was buried in sawdust on the floor. The loca-
tion of the food was consistently in the middle of four bottles that were
arranged as four corners of an imaginary square (Figure 8.32A, upper
panel). The birds readily learned to find the food using the landmarks.
When Spetch et al. (1997) expanded the arrangement of the bottles dur-
ing the test trials (see Figure 8.32A, bottom panel), the birds continued to
search in the usual position, a fixed distance from one of the landmarks,
instead of searching at the “middle” of the expanded square. Such results
suggest that the birds were using the configuration of landmarks, but
only for direction—rather than distance—information (see also Spetch,
Cheng, & MacDonald, 1996).
Spetch et al. (1996, 1997) also tested humans in related experiments. For
example, in an experiment conducted outdoors in a grassy field, people
were asked to find a small object in the middle of four landmarks provided
by upright pool noodles. When the 6-by-6-meter “square” was expanded,
the people always searched in the “middle” of the shape rather than a fixed
distance from one landmark as the pigeons had (Figure 8.32B). Pigeons
and other birds can be trained to use “middle” if the absolute distance be-
tween two landmarks is made irrelevant by separating them by different
distances on different trials (e.g., Jones, Antoniadis, Shettleworth, & Kamil,
2002; see also Kamil & Jones, 2000; Spetch, Rust, Kamil, & Jones, 2003). So,
the system is flexible, but there is a natural tendency for many species to
use a configuration of landmarks to indicate direction from an individual
landmark (see Kelly & Spetch, 2012, for a review).
The idea that organisms might use actual shape or geometric infor-
mation has been an important issue in spatial learning research. In some
influential experiments, Ken Cheng (1986) had rats locate food (Cocoa
Puffs) in a small (120 cm by 60 cm) rectangular box with unique walls and
corners (Figure 8.33). The apparatus was housed in a black environment
so that extra-maze cues could not be seen and used to get around. In one
set of experiments, the rats could always find food in a consistent corner
of the box. They were reasonably good at finding it, which is not terribly
surprising. Cheng found, however, that the rats often made “rotational
errors” by going to the diagonally opposite corner of the rectangle. For
example, in the test summarized in Figure 8.33A, rats went to the correct
place 71% of the time, but they also went to the opposite corner 21% of
Figure 8.32  Performance of (A) pigeons and (B) humans taught to find rewards
in the middle of four landmarks arranged in a square. Top: The configuration
of landmarks in training and location of searches in the middle. Bottom:
Results of test trials after the square created by the landmarks was expanded.
Pigeons searched in the same position a fixed distance from an individual
landmark, whereas humans continued to search in the center of the expanded
square. (After Spetch et al., 1997.)

the time. When the rats were tested after the nearby corners and distinct
walls were removed (replaced with black), the preference for the opposite
corner was even more extreme; they now went to the correct corner 47%
of the time and the diagonally opposite corner 53% of the time (see Fig-
ure 8.33B). The increase in errors that occurred when the local cues were
removed suggests that the rats had been using these cues to some extent.
The persistent choice of the opposite corner was especially impressive and
interesting, however.
Cheng (1986) noted that the diagonal corners are geometrically equiva-
lent (e.g., both corners had a long wall on the right and a short wall on the
Figure 8.33  Apparatus and results of experiments reported by Cheng (1986).
(A) Food was hidden in the location indicated by the red dot. Rats were
accurate in finding the food (numbers indicate percentage of responses there),
but often made errors—“rotational errors”—by going to the diagonally opposite
corner of the box. (B) When rats were tested after cues from nearby walls and
corners were removed, they went exclusively to the two geometrically
equivalent corners. Perhaps they were using global information about the shape
of the environment to find the food (see also Figure 8.38). (After Cheng, 1986.)

left); therefore, the rats were using geometric information. He suggested
that the rats were using a global representation of the shape of the box to
find the food and that there might be a geometric module in their brains
that is exclusively dedicated to encoding the shape of the environment
(see also Gallistel, 1990). Shape was assumed to be stored and remembered
separately from any landmarks. Whether shape is truly stored separately
is debatable at the present time (see Cheng & Newcombe, 2005; Kelly &
Spetch, 2012), but similar experiments in rectangular environments suggest
that rotational errors often occur in a number of species, including pigeons
(e.g., Kelly & Spetch, 2001), fish (e.g., Sovrano, Bisazza, & Vallortigara,
2003), and young children (Hermer & Spelke, 1994), among others (see
Cheng & Newcombe, 2005, for a review). These findings suggest that a
geometric module may be generally important. The possibility that animals
learn about the global shape of an environment seems especially consistent
with the idea that they learn a kind of map (but see pp. 353–355).
Spatial learning in the radial maze and water maze
Although experimenters have used a wide variety of methods for studying
spatial learning in animals, two laboratory tasks have become especially
common. The first is the familiar radial maze. As you already know, rats
clearly use cues at the end of the maze arms to get around efficiently in this
environment. In addition, when experimenters use electrodes to measure
activity in cells in the hippocampus, the cells appear to fire only when the
rat is in a particular place on the maze (e.g., O’Keefe & Speakman, 1987).
The discovery of such place cells encourages the view that the brain (the
hippocampus, in particular) is wired and organized to detect an animal’s
place in space.
There are at least two ways to think about how the rat represents space
in this task. One is that it might learn a mental map and somehow “mark”
each location on the map after it has been visited (e.g., Roberts, 1984).
Another possibility is that locations might be stored as items in a list (e.g.,
Olton, 1978). As the rat moves around the center of the radial maze, it might
look down an arm and make a decision about whether or not to visit that
particular arm. The decision would be guided by memory, of course, but
it would not necessarily be in the form of a map. There is evidence that the
list-type of representation occurs. Michael Brown (1992) noticed that rats
on the center of the maze would first orient toward an arm and often stop
before deciding to enter or reject it. He called these orienting responses
“microchoices” and discovered that they occurred at different arms more
or less randomly—that is, the rats did not appear to be guided by a map.
In subsequent experiments, however, Brown, Rish, Von Culin, and Edberg
(1993) forced the rat to make a few choices before removing it from the
maze. While the rat was away, they put an opaque cylinder around the
center of the maze. When the rat was returned to the maze, it was put
inside the cylinder and now had to push a door open to look down any
arm. Even though the rats could not see the arms or locations at the ends
of the arms, they still preferred to open doors to those arms that they had
not previously visited on the first trial. Some kind of memory was thus
guiding their “blind” choices within the cylinder, and Brown et al. (1993)
suggested that it might have been a map. In contrast, when extra-maze
cues are easier to view from the center, it might be easier to depend on the
microchoice strategy.
Another laboratory maze is the water maze, first developed by Richard
Morris (1981). The water maze is typically a circular pool of water, 1 to 2
meters in diameter, that is situated in a room (Figure 8.34A). The water
in the pool is made opaque with a substance that gives it a milky quality.
The rat is required to swim in the pool to find a submerged platform that
it can stand on to keep its head above the water. If the platform is marked
by a beacon, such as a stick protruding from the water next to it, the rat
readily learns to approach the beacon to reach the platform. But if there is
no beacon, the rat is forced to use landmarks from the room outside the
pool to locate the hidden platform efficiently.
In one of his original experiments, Morris (1981) gave rats training
trials in which they always began from a start point at the west side of a
tank to find a hidden platform in the northeast. (There was no beacon in
this experiment.) On a test trial (summarized in Figure 8.34B), one group
of rats (the Same-place group) was started from a point they had never
started from before (i.e., north, east, or south), but the platform was in the
same place as usual. As shown in the top row of Figure 8.34B, these rats
had no trouble finding the location of the platform despite the change in
starting point.

Figure 8.34  (A) A rat in a typical water maze. (B) Paths taken by individual
rats on test trials in the water-maze experiment by Morris (1981). Rats in the
Same-place group found the hidden platform (small circle) easily, even though
they were started from a new location on the perimeter of the pool. Rats in the
New-place group continued to search in the area where the platform had been
located previously, before it was moved. In either condition, the rats had to
use landmarks from the room outside the pool to locate the hidden platform.
Control rats were started from their usual starting place. (A, after Morris et
al., 1982; B, after Morris, 1981.)

A second group (the New-place group) was also started from new locations, but
their platform had been deviously moved. As shown
in Figure 8.34B, they eventually found the new platform location, but first
searched for a while in the original place. A control group had a test that
was no different from what it had learned at the start; their performance
held no surprises. Morris also soon showed that rats that had lesions of
the hippocampus were very bad at learning this task (e.g., Morris, Gar-
rud, Rawlins, & O’Keefe, 1982). In the years since these early experiments,
there have been so many other experiments that have shown that rats with
lesions of the hippocampus do poorly in the water maze that poor per-
formance has become a kind of litmus test of a good hippocampal lesion.
How do they do it?
Although the research just described tells us what kinds of cues animals
use when they get around in space, it has not told us much about how they
actually learn about those cues. Similarly, although Tolman talked about
maps and provided evidence that animals might learn about mazes with-
out reinforcement (e.g., the latent learning experiment), he said relatively
little about how maps were actually learned. It has been fairly common to
assume that spatial learning follows unique learning principles. Gallistel
(1990, 1994), for example, has noted a number of unique mental computa-
tions that would be required for an animal to construct a map. And, in an
influential book, O’Keefe and Nadel (1978) distinguished between two
memory systems: what they called the “locale system,” which involved
learning and using a cognitive map through exploration; and the “taxon
system,” which involved learning routes through space and S-R connec-
tions. Classical conditioning and most other types of learning that we have
considered were thought to be examples of the taxon system but not the
locale system, which instead was devoted to spatial learning and mapping.
Exploration that allows the animal to connect different parts of the en-
vironment seems to be important in spatial learning. For example, Suther-
land, Chew, Baker, and Linggard (1987) had rats learn to find a hidden
platform in a water maze. Some rats were able to swim freely around the
maze and observe the extra-maze cues from all vantage points. Other rats,
however, could only see half the room at any one time because a black
curtain was hung over the middle of the water maze so that it bisected the
pool. When the rats were tested from a new starting point, the rats that
had been allowed to view the room from all vantage points did better than
those that had not. Similar results have been reported by Ellen, Soteres,
and Wages (1984), who had rats learn about an elevated maze like the one
shown in Figure 8.35.

Figure 8.35  Apparatus used by Ellen et al. (1984). (After Shettleworth, 1998.)

Three different tables were connected by open runways that joined in the center
in a Y configuration. Wooden screens were positioned on the three tables so that
observation from one table to the others was obstructed, except for a small
entrance. On test trials, the rats were allowed to eat a little food on one of
the tables, and then they had to find their way back to that table after being
put on another table. Rats that had been allowed to explore the runways and
connections between all three tables did better than those that had been
allowed to explore only one or
two tables (and their runways) at a time. Unfortunately, this experiment
did not control for the amount of exposure to the tables or practice walking
on the narrow runways (the experimental groups had had more practice).
Preliminary results, however, do suggest that exploring the links between
places is important for being able to move efficiently through a larger space.
There is also growing evidence that animals may learn to find goals in
spatial tasks through familiar learning principles. For one thing, a lot of the
learning that occurs when rats locate a hidden platform in the water maze
may involve the rat learning to go in a general direction rather than toward
a specific spatial location. When Derek Hamilton and his colleagues trained
rats to find a hidden platform in a water maze, the rats still headed in the
same general direction after the maze was moved, even if it meant going
to a new, and incorrect, absolute location within the room (e.g., Hamilton,
Akers, Weisend, & Sutherland, 2007; Hamilton et al., 2008; see also Blodgett,
McCutchan, & Matthews, 1949; Skinner et al., 2003).
When a beacon is present to signal the platform’s location, the rat might
also learn about it and landmarks in the room through familiar principles.
For example, Roberts and Pearce (1999) demonstrated a blocking effect
(see Chapters 3 and 4), suggesting that spatial learning might involve the
kind of cue competition that we know so well in other types of learning.
The design of one experiment is shown in Figure 8.36A.

(A)
Group                  Phase 1 (No room cues)    Phase 2                        Test
Block                  8 Beacon — Platform       4 Beacon + Room — Platform     Room?
1 session control      1 Beacon — Platform       4 Beacon + Room — Platform     Room?
No training control    ———                       4 Beacon + Room — Platform     Room?

[(B) plots time in the correct quadrant (s) during the test for the Block,
1 session control, and No training control groups.]

Figure 8.36  Design (A) and test results (B) of an experiment on blocking in
the water maze. In Phase 1, rats in the Block group were given eight sessions
in which they learned to find a hidden platform near a beacon. Room cues were
made unavailable by a curtain surrounding the maze. In Phase 2, the beacon was
still present while the rats could learn the platform's position with respect
to room cues (the curtain was removed). In the final test, the room cues were
present but the beacon was not. Rats in the Block group spent little time
searching for the platform in the correct quadrant of the pool. Learning about
the beacon in Phase 1 thus blocked learning about the extra-maze room cues in
Phase 2. (B, after Roberts & Pearce, 1999.)

In Phase 1, rats were trained to find a submerged platform located near a
beacon in a water maze. A curtain around the maze eliminated external room
cues. Control
groups received none of this training or only a single session (which was
not expected to allow much learning about the beacon). In Phase 2, the
curtains around the pool were opened so that the rats could now see cues
from the surrounding room. At this point, the groups all received several
sessions in which they swam to the hidden platform (marked by the bea-
con) in the presence of all the room cues. Notice that in the Block group,
the room cues were redundant to the beacon in helping find the platform.
The beacon might therefore have blocked learning about the room cues
in this group.
To test this possibility, the experimenters finally allowed the rats to
swim in the pool in the presence of the room cues with the beacon and
platform removed. The results are shown in Figure 8.36B. The Block
group showed more or less directionless behavior; the rats did not spend
much time swimming in the quadrant of the pool where the platform had
been. The control groups, however, searched near the former platform
location; they had clearly learned more about the room cues. The beacon
had thus blocked learning about the room cues in the Block group. In a
similar experiment run in Richard Morris’s lab (Biegler & Morris, 1999),
blocking occurred on dry land when landmarks standing on the floor of a
sawdust-covered arena signaled the location of food. In this case, when the
to-be-blocked cue was added at the start of the compound phase (Phase 2),
there was evidence that the rats noticed it—they went over and sniffed it
and investigated. Thus, blocking in spatial learning is not simply due to a
failure to notice the added cue. Competition between landmarks in spatial
learning tasks has now been observed by several investigators using sev-
eral methods (Redhead, Roberts, Good, & Pearce, 1997; Rodrigo, Chamizo,
McLaren, & Mackintosh, 1997; Spetch, 1995; see also Chamizo, Sterio, &
Mackintosh, 1985, for evidence in a radial maze).
These results suggest that the rules that govern learning about beacons
and landmarks might not be entirely different from the rules that govern
associative learning as it is represented in Pavlovian conditioning. For
example, models like the Rescorla-Wagner model and others discussed in
Chapter 4 may apply (e.g., see Miller & Shettleworth, 2007, 2008). On this
sort of view, landmarks may compete with one another for association
with the goal the same way that CSs compete for association with a US.
And also consistent with this kind of view, other classic compound con-
ditioning phenomena have been demonstrated. For example, if the corner
of a rectangular pool signals the presence of a platform but the platform
is missing on trials when an extra cue is stuck to the wall, the extra cue
becomes a conditioned inhibitor, essentially signaling that the platform is
not there (e.g., Horne & Pearce, 2010). The inhibitory cue has several key
features that classically conditioned inhibitors are known to have (see also
Sansa, Rodrigo, Santamaria, Manteigia, & Chamizo, 2009). Once again,
there seems to be some generality to the laws of learning derived from
studies of conditioning.
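For readers who want to see how this kind of cue competition could work
mechanically, here is a minimal sketch of how a Rescorla-Wagner-style
simulation might capture the Roberts and Pearce (1999) blocking design. It is
written in Python purely for illustration; the function name, the learning-rate
(alpha) and asymptote (lambda) values, and the trial counts are assumptions
made for this sketch, not values from the original study or from Miller and
Shettleworth's formal models.

def rw_trial(V, present, alpha, lam):
    """Rescorla-Wagner update for the cues present on one trial."""
    total = sum(V[c] for c in present)   # combined prediction from all present cues
    error = lam - total                  # prediction error: outcome minus prediction
    for c in present:
        V[c] += alpha[c] * error         # every present cue shares the same error
    return V

V = {"beacon": 0.0, "room": 0.0}         # associative strengths start at zero
alpha = {"beacon": 0.3, "room": 0.3}     # equal saliences (an assumption)
lam = 1.0                                # asymptote supported by finding the platform

for _ in range(40):                      # Phase 1 (Block group): beacon alone
    rw_trial(V, ["beacon"], alpha, lam)

for _ in range(20):                      # Phase 2: beacon + room cues in compound
    rw_trial(V, ["beacon", "room"], alpha, lam)

print(V)                                 # beacon near 1.0; room stays near 0 ("blocked")

With these assumed values, the beacon is essentially at asymptote after Phase 1,
so the compound trials in Phase 2 generate almost no prediction error and the
room cues gain very little strength, paralleling the Block group's aimless
search during the test. Skipping Phase 1, as in the control conditions, lets the
room cues acquire substantial strength instead.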
There may be limits to this conclusion, however. Investigators have
had some trouble demonstrating that beacons can block learning about
geometric cues, as opposed to landmarks and other beacons. For example,
Pearce, Ward-Robinson, Good, Fussell, and Aydin (2001) first trained rats to
swim to a hidden platform in a circular pool by using a beacon. (Extra-maze
cues were again made unavailable by a curtain around the maze.) Then,
the experimenters used the beacon to signal a platform near a corner of a
“triangular” shape, which was created by suspending two plastic walls in
the pool (Figure 8.37A). Because extra-maze cues were again eliminated
by putting a curtain around the pool, the training consisted of a compound
of a previously trained beacon and a new geometric shape. When rats were
then tested in the triangle without the beacon, they searched the location
of the platform as much as the control had. That is, there was no evidence
that the beacon had caused blocking! Additional experiments strongly sug-
gested that the blocking failure was not because the beacon was too weak
or not salient (Hayward, Good, & Pearce, 2004). More recent evidence,
however, suggests that the natural salience of many geometric cues can
simply make it difficult to compete with them. When similar experiments
were conducted in a rhomboid shape (Figure 8.37B), the acute-angled
corners were found to be more salient predictors of the platform than the
How Stimuli Guide Instrumental Action   353

(A) (B)
Acute

Obtuse

Platform

Figure 8.37  (A) Walls (straight lines) inserted in the typical water maze intro-
duce strong geometric cues. Learning about geometric information (e.g., the
platform is near a corner) is difficult for a beacon to block. (B) If a rhomboid
shape is used instead, geometric cues provided by the obtuse corners are less
salient than the ones provided by acute corners. Beacons compete better with
obtuse corners.

obtuse-angled corners (Kosaki, Austen, & McGregor, 2013). The beacon


was then shown to be better at competing with the obtuse corner than
the more-salient acute corner. This finding is consistent with conditioning
models like the ones discussed in Chapter 4 because salient cues should
be harder to compete with than less salient cues.
Another point of view is also possible. Any evidence that beacons do
not compete with geometric cues could be consistent with the possibility
that animals learn about the shape of the environment in a “geometric mod-
ule” that is not influenced by information from other cues (Cheng, 1986;
Gallistel, 1990). As described earlier, the idea has been that organisms that
find goals in rectangular environments learn a global representation of the
space (Cheng, 1986; Gallistel, 1990). The main evidence for such a global
representation is that in an environment like the one at the left in Figure
8.38A, many organisms make rotational errors—if the reward or goal is at
Corner A, they often incorrectly go to Corner C. Pearce, Good, Jones, and
McGregor (2004) thought about the explanation of this finding. They noted
that although the confusion between Corners A and C was consistent with
the idea that the rat had a global geometric representation of the space, it
was also possible that the mistakes were guided by purely local cues. For
example, the rat might simply learn to approach a corner where a short wall
is to the left of a long wall, or the animal might merely learn to approach
a long wall and then turn left. Either of these possibilities would get the
animal to Corner A or Corner C equally often. But neither choice requires
the rat to learn a global representation of the shape of the environment.
Pearce et al. (2004) ran an ingenious experiment that tested these ideas.
They suspended large walls in a pool so that the walls created a rectangle
that the rat could be confined in.

[(A) The two environments: a rectangle with Corners A, B, C, and D, and a kite
with Corners E, F, G, and H. (B) Kite test phase: mean percentage of correct
trials across three sessions for the Consistent group (Corner E correct) and
the Inconsistent group (Corner G correct).]

Figure 8.38  Environments (A) and results (B) of the experiment by Pearce et
al. (2004). Rats were first trained to find the platform at Corner A in the
rectangle. The rats went to Corner C just as often as they went to Corner A
(rotational errors, as in Figure 8.33), suggesting that they were using a global
representation of the rectangular environment. Such a representation would have
been useless in the kite, at right. In the second phase, however, which was
conducted in the kite, the rats that could find a platform at Corner E did much
better than the rats that could find the platform in Corner G. Corner E in the
kite was consistent with Corner A in the rectangle because both were to the left
of a long wall (for example). To get to Corner A or Corner E, the rats might
have merely found a long wall and turned left. They were using local cues rather
than a global representation of the environment. (After Pearce et al., 2004.)

There were no features or landmarks on
the walls, and a featureless curtain surrounded the maze (so, once again,
extra-maze cues were not available). The rats had to learn to find a sub-
merged platform near Point A. The rats showed the usual rotational errors:
At the end of training, they swam to Corner A at the start of 44% of the
trials and to Corner C at the start of 45% of the trials. In a second phase,
the walls were rearranged to make a kite shape (see Figure 8.38A, right).
If the rats had learned to respond in the first phase according to the global
representation of a rectangular environment, initial training should have
had little effect on responding in the kite configuration. Notice, though,
that Corner E was otherwise the same as Corners A and C—all were right
angles, with a short wall to the left of a long wall. If the rat had learned to
respond to local cues, there were grounds for thinking that it might find a
hidden platform in the kite configuration if the platform were positioned
near Corner E.
In fact, there were two groups in the Pearce et al. (2004) experiment, and
each group received different treatments in the kite phase. The Consistent
group had to find the hidden platform in Corner E, the corner consistent
with the corner that had been correct in the rectangle. The Inconsistent
group instead had to find a platform at Corner G. The results are shown in
Figure 8.38B, which shows the percentage of trials on which the rat first
went to the correct corner in the kite phase. Both groups went to Corner E,
the “correct” corner for the Consistent group but an incorrect corner for the
Inconsistent group. They were thus responding to the local cues. Equally
interesting is that the rats went to Corner F (the apex of the kite) about as
often as they went to Corner E—as if they really had learned to go to a
long wall and turn left. The results clearly indicate that rats can behave as
if they have learned a global geometric representation by responding to
purely local cues.
These results, along with others (Esber, McGregor, Good, Hayward,
& Pearce, 2005; Horne, Gilroy, Cuell, & Pearce, 2012), raise new questions
about evidence suggesting that animals learn a global geometric represen-
tation of the environment. We are once again left with the conclusion that
simple learning rules, not unlike the ones we have discussed throughout
this book, may account for behavior in a wide range of situations. You
might be reminded of a theme that we have seen many times before: Al-
though many types of learning at first appear different from the types of
learning that we saw in earlier chapters, there is surprising generality to
the basic principles of learning.

Metacognition
At the start of this chapter, I noted that modern research on the stimulus
control of instrumental behavior often uses stimulus control techniques
to investigate cognitive processes in animals—the study of “animal cog-
nition” (e.g., Zentall & Wasserman, 2012). Research on spatial learning
along with many other topics we have covered in this chapter—such as
categorization, perception, attention, working memory, reference memory,
episodic memory, and timing—are all examples of work in this domain.
One observation is that the boundary between the processes that we in-
voke to explain animal versus human behavior is not always clear. That
is, the line between animal and human cognition has been blurred a little
bit. Another point is that the basic learning processes we discussed in
earlier chapters often play an important role in understanding what look
like higher, more complex cognitive processes. Categorization and spatial
learning (among other things) are good examples of this. The line between
basic associative learning principles and seemingly higher-level cognitive
processes has become a little blurred, too.
It is therefore worth considering a final topic that seems to blur the
lines still further. Humans are generally aware of their own mental states
and can monitor them and report on them at will. Thus, I know when I am
confused by a technical article I am reading, and I can spend a little extra
time studying it. Similarly, I can tell you whether (or not) I remember the
names of the capitals of New Jersey, Kazakhstan, or Burundi. My ability
to monitor and report on my own cognitions is known as metacognition
(cognition about cognition). Several investigators have done some interest-
ing research to ask whether animals are also capable of it.
Robert Hampton (2001) reported evidence of metacognition in ani-
mals in an article entitled “Rhesus Monkeys Know When They Remem-
ber.” Monkeys sat in front of a touchscreen and participated in a delayed
matching-to-sample experiment. The method is sketched in Figure 8.39A.
At the start of a trial, a sample picture was presented; after a delay, four
comparison pictures appeared, and touching the picture that matched the
sample produced a tasty peanut reward. If the monkey touched the wrong
image, it received a 15-second timeout instead. The pictures changed from
day to day. Hampton also varied the interval between sample and test from
short (about 12 seconds) to very long (about 4 minutes). The monkeys often
forgot the sample over the longer delays. But here is the interesting part:
The monkey sometimes had an opportunity to decline to take the test, and
the results suggested that it did so mainly if it “knew” that it had forgotten
the sample. Here is how it worked. At the end of each delay, just before the
memory test, the monkey had to touch yet another picture that appeared on
the screen to accept and proceed to the test. On many of the trials, a second
picture also appeared, and this one could be touched to decline the test. If
the monkey declined, the trial ended, and a primate pellet, a less preferred
reward, was delivered. The pellet was less valuable to the monkey than a
tasty peanut, but it was certainly better than the timeout he would receive
if he responded incorrectly on the test. In fact, the monkeys often chose the
“decline” option on trials when the delay was long (Figure 8.39B). Their
memory performance was better on trials in which they actively accepted
the test compared to trials when they had no option to decline. The pattern
suggests that monkeys can learn to accept tests when memory is good and
decline them when memory is poor. The monkeys also (quite sensibly)
declined the test on occasional trials when no sample was presented at all.
The overall pattern of results suggested that monkeys were able to monitor
their memory and accept or decline the test in a way that improved their
chances of reward. They performed as if they could monitor their own
memory—a form of metacognition known as “metamemory.”
[(A) Trial structure: sample, then a delay, then the choice to accept or
decline the test (or a forced test); a correct test response earned a preferred
peanut, and declining earned a primate pellet. (B) Proportion of trials correct
or declined as a function of the delay interval (s): accuracy during accepted
tests, accuracy during forced tests, and the proportion of tests declined.]

Figure 8.39  (A) Illustration of the delayed match-to-sample trials used by
Hampton (2001). Monkeys were reinforced for matching the sample (top) during
the test that occurred after a delay (left). Just before they took the test,
they could accept the test (by touching "Accept"); on some trials (right), they
could choose to decline it (by touching "Decline"). (B) Performance of one of
the monkeys in Hampton's experiment. As the delay interval between the sample
and test increased, the monkey was more likely to decline the test (red line).
Memory performance (accuracy) was also better on trials when the monkey chose
to accept rather than decline (accepted tests) than when it had no choice but
to accept ("forced" tests). (After Hampton, 2001.)

Other studies also suggest that monkeys will learn to decline a test if
their knowledge is not up to snuff. Basile, Schroeder, Brown, Templer, and
Hampton (2015) reported that rhesus monkeys in similar delayed match-
ing-to-sample experiments will also learn to request to see the sample again
(by making a touchscreen response) if their memory was arguably weak
before the test (e.g., after a long delay or when no sample was presented
at the start of the trial). Similarly, J. David Smith, Michael Beran, David
Washburn, and their colleagues have reported a number of experiments in
which animals also decline to respond when a discrimination between two
stimuli gets difficult (e.g., Smith et al., 1995; for reviews, see Smith, Beran,
& Couchman, 2012; or Smith, Couchman, & Beran, 2014). For example,
rhesus monkeys were trained to discriminate between squares that were
either filled with many dots (“dense” squares) or relatively few of them
(“sparse” squares) (Smith, Beran, Redford, & Washburn, 2006; Smith, Cou-
tino, Church, & Beran, 2013). When a dense square appeared, the monkey
was rewarded for touching a D on the screen; when a sparse square ap-
peared, it was reinforced for pressing an S. There was a 20-second timeout
if the monkey was incorrect. A wide range of different densities were actu-
ally trained; D was reinforced if the square was denser than the middle
point, and S was reinforced if the square was sparser than the middle point.
Densities in the middle were the most difficult to categorize. The monkey
could also make an “uncertain” response (aptly touching a ? on the screen!)
that ended the trial and started a new one. The monkeys learned to touch
the uncertain response, especially on trials in which they had to categorize
stimuli in the difficult center range. After ruling out several less interest-
ing explanations (see also Basile et al., 2015), Smith et al. (1995; see also
Smith et al., 2012; Smith et al., 2014) argued that the monkeys sensed their
own uncertainty on difficult trials and opted out appropriately. They thus
showed more evidence of metacognition.
How do they do it?
The question, as usual, is, How is this accomplished? One idea is that rhe-
sus monkeys can monitor and make fairly sophisticated use of their own
mental states, perhaps as human beings do, and respond accordingly (e.g.,
Smith et al., 2012; Smith et al., 2014). One problem with this approach is
that it does not really explain metacognition, or how the animal learns to
use it, at a mechanistic level; it mainly asserts that metacognition is there.
Another approach is that the animals learn to make “decline” or “uncer-
tainty” responses because subtle reinforcement contingencies reinforce
them for doing so (LePelley, 2012, 2014). For example, in Hampton’s task
(Figure 8.39A; Hampton 2001), strong or weak memories of the sample
provide SDs for accept and decline responses, respectively; in Smith et al.’s
tasks, timeouts for incorrect responding are aversive, and animals learn to
make uncertainty responses to avoid them. Smith and his colleagues have
argued that this type of approach is challenged by some of the methods
and findings in the literature (see Smith et al., 2014; but see also LePelley,
2012, 2014). But the results with metamemory, for example, seem most
consistent with the idea that although rhesus monkeys can indeed learn to
report on the strength of their memory traces, “learning to pair strong and
weak memory states with two different responses would proceed according
to accepted associative principles” (Basile, Schroeder, Brown, Templer, &
Hampton, 2015, p. 100). As we have seen before, the general principles of
learning have not gone away.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources
and online quizzes.

Summary
1. Stimulus control techniques provide a powerful method for studying
processes of animal cognition.
2. Animals can sort stimuli into polymorphous categories—that is, cat-
egories that are not defined by a single feature (e.g., people, cats,
cars, flowers, and chairs). Such categorization probably depends on the
basic learning mechanisms discussed in Chapters 4 and 5. The learning
process may (a) find the most predictive features and associate them
with each category, (b) construct a prototype of each category, or (c)
allow the animal to remember each example and respond to new stimuli
according to how similar they are. The Rescorla-Wagner model and its
successors accomplish (a) and (b), and the Pearce configural learning
model accomplishes (c).
3. Organisms generalize from one stimulus to another depending on their
physical similarity. We generalize between two stimuli depending on the
number of common features (elements) that they share.
4. Generalization is also affected by learning:
(a) Generalization gradients are sharpened by discrimination training, in
part because the training introduces inhibition to S–. The presence of in-
hibition to S– can produce surprising peaks in responding to other stimuli.
For example, in peak shift, the highest level of responding occurs in the
presence of a stimulus that has never actually been reinforced.
(b) Generalization between two stimuli will increase if they are associ-
ated with a common event or stimulus. Such mediated generalization al-
lows new, superordinate categories (e.g., furniture, clothing) to be built
without physical similarity between the individual stimuli.
(c) Mere exposure to similar stimuli (like various birds or types of wine)
can make them easier to discriminate. Such perceptual learning may
occur because exposure to similar stimuli latently inhibits their common
elements, creates inhibition between their unique elements, and/or cre-
ates a unitized representation of each stimulus.
5. For a stimulus to guide instrumental action, it must be perceived, attended
to, and remembered.
6. Attention requires that the stimulus be informative. Attention can also
be boosted by attentional priming.
7. Working memory allows stimuli or events to guide behavior after they
are gone. It has been studied in the delayed matching-to-sample meth-
od and in the radial maze. Working memory is influenced by practice
and by retroactive and proactive interference. Animals appear to use
working memory actively and efficiently, as when they switch between
retrospective codes (remembering events that have come before) and
prospective codes (remembering events that are coming in the future).
8. There are several different types of human long-term or reference
memory. They can be difficult to study in animals, although there is evi-
dence that scrub jays and other animals have an episodic-like memory
that incorporates what, when, and where information about food items
that they have encountered or stored.
9. Time can be an important guide to instrumental action. Animals (and
humans) are sensitive to time of day cues, which appear to depend on
a circadian clock with a period of about a day. Organisms are also good
at timing intervals on the order of minutes and seconds. Timing may be
accomplished with an “internal clock” that may involve a pacemaker
and an accumulator or the readout of a set of oscillators whose states
change at different rates. Timing might also be accomplished with
mechanisms that do not require a “clock.”
10. Spatial cues also guide instrumental action, and organisms use bea-
cons, landmarks, and dead reckoning to get around in space. Organ-
isms might also form a geometric representation of the environment.
Although spatial learning might be accomplished by specialized learn-
ing mechanisms, research suggests that it depends at least in part on
familiar learning principles that allow animals to associate beacons and
landmarks with goals.
11. Rhesus monkeys have shown evidence of metacognition, an ability to
monitor their own cognitive states: They can learn to make responses to
opt out of situations in which their memory might be weak or when they
are “uncertain” of the correct response. Familiar learning principles may
contribute here as well.

Discussion Questions
1. Discuss the methods that have been used to study categorization in
pigeons. Identify the discriminative stimulus, operant response, and
reinforcer. Why are transfer tests so important in this research? How well
do you think that these experiments capture categorization as it occurs
in humans?
2. Discuss the key results that indicate that generalization is influenced by
learning. Then take one of those results and discuss how it could affect
the generalization we might see in a study of categorization, spatial
learning, or anxiety in a person with a learned (conditioned) phobia.
3. Distinguish between working memory, reference memory, and episodic-
like memory in animals. How have investigators studied them?
4. Use the pacemaker–accumulator model to explain why a pigeon earn-
ing food on a Fixed Interval 60-second reinforcement schedule will
show a peak of responding every 60 seconds. How would the multiple-
oscillator model explain the same thing?
5. It has been argued that simple associative learning principles (like those
emphasized in theories of classical conditioning) can go some distance
in explaining examples of animal cognition like categorization, percep-
tual learning, spatial learning, and even metacognition. Discuss the
evidence. How far do you think that such principles can go in helping
understand human cognition?

Key Terms
acquired equivalence  317
animal cognition  295
attentional priming  326
beacon  343
behavioral theory of timing  342
categorization  298
circadian rhythm  335
conditional discrimination  318
dead reckoning  343
declarative memory  332
delayed matching-to-sample (DMTS)  327
episodic memory  332
exemplar theory  305
feature theory  302
geometric module  346
geons  321
inhibition  308
internal clock  340
landmark  344
matching-to-sample  318
mediated generalization  318
metacognition  356
multiple oscillator model  341
multiple-time-scale model  342
peak procedure  337
peak shift  311
perceptual learning  314
place cells  346
procedural memory  332
prospective code  330
prototype  304
prototype theory  305
radial maze  328
reference memory  330
retrospective code  330
scalar property  338
search image  325
semantic memory  332
stimulus generalization gradient  306
superposition  338
temporal bisection  338
temporal generalization  336
transfer tests  301
transposition  312
water maze  347
working memory  326
Chapter Outline
How Motivational States Affect Behavior  364
   Motivation versus learning  364
   Does Drive merely energize?  366
   Is motivated behavior a response to need?  371
Anticipating Reward and Punishment  376
   Bait and switch  376
   The Hullian response: Incentive motivation  379
   Frustration  380
   Another paradoxical reward effect  382
   Partial reinforcement and persistence  384
   Motivation by expectancies  387
   General and specific outcome expectancies  391
   What does it all mean?  394
Dynamic Effects of Motivating Stimuli  396
   Opponent-process theory  396
   Emotions in social attachment  399
   A further look at addiction  401
Conclusion  404
Summary  405
Discussion Questions  407
Key Terms  408
Chapter 9
The Motivation of Instrumental Action

One detail that has been all but missing from Chapters 7 and 8 is the idea
that instrumental behavior is motivated. To many people, the importance of motivation
is self-evident. Emotions and motives are important
parts of our lives, and the purpose of instrumental ac-
tion is obviously to satisfy them. Behavior seems in-
herently goal-oriented or organized around achieving
desired outcomes. That is what motivation is all about.
At least two formal properties of behavior sug-
gest that it is motivated. For one thing, behavior is
variable. In the presence of the same discriminative
stimuli (see Chapter 8) and the same history of rein-
forcement (see Chapter 7), a rat might press the lever
faster on Tuesday than on Monday. I might likewise
ride my bike home a bit more quickly that day. What
is the difference? Maybe the rat was hungrier on
Tuesday. Maybe I was, too; I usually go home think-
ing about dinner. Or maybe I wanted to get home
quickly to watch my daughter’s Tuesday soccer game.
The point is that different levels of motivation can help
explain variability in behavior that is not explained by
knowing only the prior contingencies of reinforcement
and the stimuli that set the occasion for the behavior.
Another property of behavior is its persistence. Often, behavior just
keeps on going until it is satisfied. Variability and persistence are evident
when the dog chases a rabbit—with one whiff of the rabbit, the chase be-
gins, taking the dog over hills, through bushes, and across streams until
the rabbit is caught. While biking home, you might take a long detour to
avoid a traffic jam or some dusty road construction, but you keep mov-
ing until you get home. Variability and persistence are common features
of instrumental action, and they both suggest that behavior is motivated.
These ideas were well appreciated by Edward Tolman. You will remem-
ber (see Chapter 7) that he saw instrumental behavior as fundamentally
goal-oriented—a variable means to a fixed end. Based on experiments like
the latent learning experiment (Tolman & Honzik, 1930), Tolman argued that
reinforcers were not really necessary for learning, but were instead essential
for motivating behavior and giving it a purpose. He separated learning from
performance. Rats that dawdle in the maze without reward may actually be
learning about the maze. The rats merely have no reason to show what they
know until a reward gives them a reason to perform efficiently. The distinc-
tion between learning and performance is another thing that motivation is
all about. You can have knowledge on the one hand and behavior on the
other, but motivation is the thing that often translates knowledge into action.
This chapter continues our discussion of instrumental behavior by
considering some of its motivational aspects. As we will see, motivational
principles are indeed relevant for a complete understanding of instrumen-
tal action. In addition, what you already know about learning turns out
to be quite crucial in understanding motivational processes themselves.

How Motivational States Affect Behavior


Motivation versus learning
Laboratory studies of motivation began in the 1920s and 1930s, when sev-
eral kinds of experiments convinced everyone that motivation is impor-
tant. For one thing, rats that were deprived of food appeared to be more
active in mazes and running wheels than nondeprived rats (e.g., Richter,
1927). Other experiments indicated that rats that were deprived of food or
water—or even sex—would walk across an electrified grid floor to get to
the corresponding goal object (Warden, 1931). The number of floor cross-
ings the rats would tolerate was also a lawful function of the degree of
deprivation—so motivation could be quantified. Still other experiments
suggested that humans and other animals that were deprived of specific
nutrients had specific hungers that motivated them to eat more of these
missing nutrients. For example, Curt Richter told of a boy who was admit-
ted to the Johns Hopkins hospital with an amazing appetite for salt—he
would add salt to all his food. The boy was probably born without adrenal
glands, which are necessary for producing a hormone (aldosterone) that
retains salt in the kidneys. He was therefore in a continual state of salt
need. Indeed, rats whose adrenal glands were surgically removed also
demonstrated great persistence in working for salt (Richter, 1936). Other
experiments seemed to suggest that rats (Richter, Holt, & Barelare, 1938),
children (e.g., Davis, 1928), and other animals could select a healthy diet
for themselves when offered a cafeteria of relatively purified foods. It was
as if behavior was organized quite naturally and automatically to detect
specific needs and then go about satisfying them.
All this work was consistent with the concept of homeostasis. The idea
is that the body defends an equilibrium, and when there is some movement
away from equilibrium, the body goes about correcting for it. Thus, when
the body is depleted of something (such as food, water, or salt), it goes
about repleting itself. This pattern turns out to be an important function
of behavior—that is, it helps the body defend an equilibrium.
Homeostasis was one of the main ideas behind Clark Hull’s influential
theory of learning and motivation (Hull, 1943). We already talked a bit
about Hull’s theory in the context of theories of reinforcement; his idea
was that reinforcers function to reduce Drive. In fact, Hull was extremely
systematic and specific about what he meant by Drive and how it was
supposed to influence behavior. Drive was thought to be a very general
motivational state that was activated whenever the body was in need. Hull
described it in a very systematic and quantitative way. The core idea was
that Drive (D), the major motivator, multiplied by Habit (H)—which was
Hull’s term for “learning”—produces behavior strength:

Behavior strength = D × H
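As a purely illustrative aside (the numbers below are arbitrary hypothetical
values, not data from any experiment), the multiplicative rule can be written
out for a few values of D and H:

\[
\begin{aligned}
D = 0,\ H = 5:&\quad 0 \times 5 = 0 \quad\text{(no Drive, no behavior)}\\
D = 5,\ H = 0:&\quad 5 \times 0 = 0 \quad\text{(no Habit, no behavior)}\\
D = 2,\ H = 3:&\quad 2 \times 3 = 6\\
D = 4,\ H = 3:&\quad 4 \times 3 = 12 \quad\text{(doubling D doubles strength at any level of H)}
\end{aligned}
\]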

Notice that Hull gave motivation and learning equal importance. If either
Drive or Habit were equal to zero, there would be no behavior at all. The idea
that D and H multiplied each other was consistent with the results of
experiments in which rats, given different numbers of reinforcers for lever
pressing, were tested for lever pressing at two levels of food deprivation
(Figure 9.1; Perin, 1942; Williams, 1938). The number of reinforcers influenced
Habit, and the amount of food deprivation determined Drive. Both clearly had a
powerful effect on performance, and the two did not interact in the sense that
the effect of Habit was the same regardless of the level of Drive. (In Figure
9.1, the two lines can be described by equations of the same form.) Habit and
Drive were thus independent and appeared to multiply.

[Figure 9.1 plots resistance to extinction against the number of reinforcements
(0 to 90), separately for rats tested after 23 hours or 3 hours of food
deprivation.]

Figure 9.1  Resistance to extinction is independently affected by both the
amount of food deprivation (which affects Drive, or D) and the number of prior
reinforcements (which affects Habit, or H). (Data from Perin, 1942, and
Williams, 1938; after Bolles, 1975.)

Hull's theory was successful at describing the state of knowledge about
motivation and
behavior at the time. It was also beautifully specific and testable. For these
reasons, it guided research in learning and motivation for many years,
and in doing so, it advanced our understanding appreciably. The theory
is consistent with the animal-in-the-forest fable described in Chapter 7. Re-
member that the animal that wakes up hungry in the morning is motivated
by its need for food, so it runs around blindly and restlessly and therefore
happens on an instrumental action that leads to food and is therefore rein-
forced by need reduction. The next time that the animal is hungry, the drive
will energize this new habit. The theory described an adaptive system in
which behavior was organized to compensate and respond to need. There
were several specific factors involved. First, Drive was caused by need. The
animal was only motivated by Drive when it actually needed something in
a biological kind of way. Second, Drive was thought to energize (1) consum-
matory behavior, like eating or drinking; (2) general activity, consistent with
the animal’s restless performance in the fable (and in Richter, 1927); and (3)
instrumental behavior, like lever pressing or runway running, which are
examples of goal-directed action. These ideas are sensible, and they seem
consistent with most intuitions about how a motivational state like hunger
motivates behavior. Let us see how accurate they are.
Does Drive merely energize?
Unfortunately, the theory actually only worked up to a point. It is true that
need increases the strength of consummatory activities. For example, the
more food-deprived rats are, the quicker they begin eating (Bolles, 1962,
1965). But things got strange when investigators studied the effects of Drive
on general activity. A lot of the increased activity shown by food-deprived
rats was in response to changes in stimulation, including signals of food
(Campbell & Sheffield, 1953; Sheffield & Campbell, 1954). And the effects
of food deprivation itself depended on the species being tested and on
the method used for measuring “activity” (e.g., Campbell, 1964; Camp-
bell, Smith, Misanin, & Jaynes, 1966). The picture that emerged was very
messy—there seemed to be nothing very general about general activity.
Part of the reason for the divergence between species and activity mea-
surements was probably that hunger and thirst tend to increase behaviors
that are naturally related to finding food or water in different species.
Different devices for measuring activity could differ in their sensitivity to
different types of movement. Sara Shettleworth (1975) used careful obser-
vational methods to examine what hamsters do when they are deprived
of food. She found that hungry hamsters showed higher levels of scrab-
bling (pawing the walls) and open rearing—and lower levels of scratching,
grooming, face washing, and scent marking (Figure 9.2)—than satiated
(not hungry) hamsters. A similar pattern emerged in anticipation of daily
feedings, and you may remember that another similar pattern also emerged
when Shettleworth reinforced these behaviors with food (see Chapter 2,
Figure 2.20). The overall picture suggests that motivational states increase
functional sets of behavior that are organized to deal with such states.

[Figure 9.2 panels plot the level of each action pattern (eat, hoard, drink,
urinate and defecate, yawn and stretch, shake, in nest, scratch sides, groom,
wash face, scent mark, freeze, manipulate newspaper, pick up sawdust, scrabble,
open rear, gnaw, wall rear, dig, and walk/sniff) for hungry and not-hungry
hamsters tested in the home cage (HC) and the open field (OF).]

Figure 9.2  Effects of food deprivation (hunger) on various behaviors in the
golden hamster. Hunger does not blindly energize activity, but selects
behaviors from an organized behavior system. (After Shettleworth, 1975.)

This
may sound familiar because it is a premise of behavior systems theory (see


Chapter 5); one example of a behavior system was shown in Figure 5.17
(Timberlake, 1994). The idea here is that motivational states select specific
sets of behaviors rather than energizing everything blindly.
Other problems with Drive appeared when investigators started asking
whether instrumental action is a simple function of deprivation level (see
Bolles, 1975, for one review). To some extent, it certainly is: When rats are
deprived of food or water, their operant lever pressing increases while they
work for the corresponding reinforcer (Collier & Levitsky, 1967; Collier,
Levitsky, & Squibb, 1967). One of those experiments is illustrated in Figure
9.3.

Figure 9.3  Lever pressing for food pellets as a function of the amount of
body weight loss caused by different amounts of food deprivation. (Data from
Collier et al., 1967; after Collier & Johnson, 1997.)

Things got less clear, however, when the operant behavior became

more difficult, as in complex discriminations (Bolles, 1975). And, even in


simple operant situations in which an animal performs an instrumental
act to get a reinforcer, the relationship between instrumental behavior and
motivational state is now known to be much more interesting than Hull
had initially thought.
Tony Dickinson and Bernard Balleine have argued that the effects of
motivational state on instrumental behavior depend crucially on the ani-
mal’s knowledge of how the reinforcer actually affects the motivational
state (e.g., Balleine, 2001; Dickinson, 1989; Dickinson & Balleine, 1994).
This type of knowledge depends on a process that Dickinson and Balleine
call incentive learning, which occurs when the animal ingests a reinforcer
while it is in a particular motivational state. For example, when a hungry
rat eats a new food, it may learn that the food is satisfying while it is hun-
gry. Similarly, when a thirsty rat drinks a new fluid, it can learn that the
stuff makes it feel good when it is thirsty. Once this kind of knowledge
about an incentive is learned, it can be combined with knowledge about
what instrumental actions lead to it. In this way, a motivational state like
hunger will invigorate an instrumental act like lever pressing if (and only
if) the animal knows that lever pressing produces an incentive that makes
the rat feel better in the hunger state.
An experiment by Balleine (1992) illustrates this idea. Balleine first
taught rats to press a lever for a food pellet reward while the rats were
satiated. The rats had never had these food pellets before. In a crucial
test, some of the rats were returned to the Skinner box while they were
now hungry and were allowed to lever press again. Unlike the experiment
shown in Figure 9.3, the reinforcer was not available during the test—that
is, lever pressing was tested in extinction. What happened was rather em-
barrassing for Hull: Rats tested for the first time while they were hungry
showed no effect of hunger on performance (at left in Figure 9.4A). Hull
got it wrong here; he would have predicted that the new D would have
automatically multiplied H to produce stronger responding in the tested-
while-hungry group. (It is worth noting that moving a flap on the front of
the food cup—a behavior that was more proximate to the reinforcer—did
increase; see Balleine, Garner, Gonzalez, & Dickinson, 1995.)
Balleine’s experiment also included another pair of groups, and those
are the ones that make the experiment really interesting. As before, the
rats were taught to press a lever for food pellets while they were satiated,
and then half were ultimately tested for lever pressing in extinction while
they were hungry. But these rats also had an opportunity for incentive
learning at the start of the experiment. That is, before the instrumental
learning phase began, these rats were allowed to eat the pellets while
they were actually food-deprived and hungry. The idea, again, is that
this experience could allow the rats to learn that the pellets were “good”
while they were in the hunger state. (The other pair of groups received
a similar exposure to hunger, but there was no opportunity for incentive
learning to occur.)

[Figure 9.4 plots mean total responses in the extinction test as a function of
the level of hunger during the test (Low or High) for each group.]

Figure 9.4  Incentive learning: The effect of hunger on instrumental behavior
depends on what the organism has learned about the reinforcer while in the
hunger state. (A) Rats were trained to press a lever for pellets while hunger
was Low. They were then tested in extinction with either Low or High levels of
hunger. High hunger motivated (increased) lever pressing only if the rats had
previously eaten the pellets while in the hungry state—and therefore learned
about the pellets' effects on hunger. (B) Results of a similar experiment in
which rats were trained while hunger was High and then tested in extinction
with either High or Low levels of hunger. Here, the Low level of hunger during
the test demotivated behavior only if the rats had previously eaten pellets in
the nonhungry state. (After Balleine, 1992.)

This time, when the rats were made hungry and
tested in extinction, hunger had a strong effect on lever pressing (at right
in Figure 9.4A). Thus, hunger does invigorate lever pressing, but only if
the rat knows that the outcome of responding is a good thing while in
the hunger state. In other experiments, Balleine (1992) also found that
downshifts in hunger decreased lever pressing only if the rats had the
opportunity to learn about the effects of food pellets while in the sati-
ated state. The effects of either increases in hunger (see Figure 9.4A) or
decreases in hunger (Figure 9.4B) on instrumental behavior depended
crucially on the rats’ knowledge of the effect of the pellets while in the
tested motivational state.
Similar results have been shown with thirst and liquid reinforcers
(Dickinson & Dawson, 1989; Lopez, Balleine, & Dickinson, 1992). Interest-
ingly, other evidence further suggests that baby rats do not instinctively
drink water when they are dehydrated or eat rat chow when they are
deprived of food. Instead, they first need a chance to consume and learn
about the effects of water and chow while they are either thirsty or hungry
(Changizi, McGehee, & Hall, 2002; Hall, Arnold, & Myers, 2000; Myers &
Hall, 2001). This natural incentive learning apparently happens when the
growing pup first samples water and chow in the week or two that fol-
lows weaning from the mother’s milk. There is also evidence that animals
need to learn about the motivational effects of feeling warm. Hendersen
and Graham (1979) first trained adult rats to avoid heat from a heat lamp,
which was unpleasant in a warm (27°C) environment. Then they tested
the rats in extinction when the room was either warm (27°C) or cold
(7°C). The rats continued to avoid the heat regardless of the current room
temperature. But before the test, some rats had a chance to experience
the heat lamp while the environment was cold. Doing so presumably
allowed them to learn that the lamp felt good, not bad, when they were
cold. Indeed, when these rats were tested in the cold environment, they
did not avoid the heat lamp anymore. In fact, they delayed responding
longer, as if waiting for the heat to come on. Thus, incentive learning
seems to be important in several motivational systems. For a motivational
state to influence instrumental performance, we need to learn about the
value of the reinforcer in that state.
Dickinson (1989) and Dickinson and Balleine (1994) noted that their
emphasis on incentive learning overlaps with some of Tolman’s earlier
thinking about instrumental action (Tolman, 1949). Tolman emphasized
the importance of “cathexes” in motivated behavior, by which he meant
the connection between a goal and its corresponding motivational state,
which he thought was learned during consumption of the goal object.
How this learning actually operates is still not completely understood.
Garcia (1989) and others have emphasized how foods become connected
with their effects on the digestive system; you may remember ideas like
the “hedonic shift” from Chapter 6. Cabanac (1971) demonstrated that
people rate reinforcers (such as heat and sweet flavors) as especially pleas-
ant when they experience these reinforcers in the motivational state they
satisfy, and learning about one’s own hedonic reaction to the outcome may
be involved in incentive learning (see also Chapter 10). Consistent with this
idea, foods and solutions consumed while animals are in a corresponding
state of deprivation become preferred or more reinforcing later (Capaldi,
Davidson, & Myers, 1981; Revusky, 1967, 1968). Terry Davidson (1993,
1998) has suggested that motivational states like hunger set the occasion
for relations between stimuli, like the flavor of a food and its ingestional
consequences. Through experience, we learn that hunger signals that a
flavor in turn signals something good. Perhaps this mechanism is involved
in learning about incentive value. An interesting discussion of some of
these views, as well as Tolman’s own view and its limitations, can be found
in Dickinson and Balleine (1994; see also Balleine, 2001; Dickinson, 1989;
Dickinson & Balleine, 2002).

To summarize, motivational states do affect instrumental behavior,
but their effects are more subtle and interesting than Hull imagined in
the 1940s and 1950s. First, states like hunger do not blindly energize all
behaviors, but instead seem to select specific sets of behaviors or behavior
systems that evolution has organized to deal with the goal. Second, for a
motivational state to energize an instrumental action, the organism must
first learn that the action leads to a particular reinforcer and that this re-
inforcer has a positive effect on the motivational state. The motivational
state thus influences instrumental behavior by increasing the desirability
of particular reinforcers, and the animal will then perform those behaviors
that lead to these reinforcers. This kind of view is consistent with a more
cognitive view of instrumental learning and will be considered further in
Chapter 10. Hull was an S-R theorist, and as we will also see in later parts
of this chapter, his ideas about learning—like his ideas about Drive—are
now a bit out of date.
Is motivated behavior a response to need?
One reason that the animal-in-the-forest fable is so relevant is that it also
emphasizes another important assumption present in early thinking about
motivation. The animal in the forest wakes up in a state of need, and ev-
erything that follows is basically a response to this need. In Hull’s system,
Drive was a response to need—the state of depletion motivates behaviors that
cause repletion. Hull’s thinking about motivation and Drive, often called
a depletion-repletion theory (e.g., Collier & Johnson, 1997), seems to come
naturally to people thinking about motivation and hunger.
Although it makes evolutionary sense for animals to have a system that
can respond to depletion, it is worth wondering whether it makes as much
sense to have an animal wait until it is in a state of need to begin seeking
reinforcers, like food or water. If an animal always waits until it needs
food before going about finding it, the animal could be in trouble if food
suddenly becomes truly scarce. Instead of waiting to need food, a better
system might have the animal acquiring food before there is really a need
for it. In this sense, we might expect to see motivated behavior organized to
anticipate needs rather than respond to them. Responses to depletion must
be available, of course, but they may be designed to deal with emergencies
rather than the normal everyday state of affairs.
There is actually a great deal of evidence that motivated behavior is
organized in anticipation of—rather than in response to—need. One of
my favorite experiments illustrates the point beautifully. Fitzsimons and
LeMagnen (1969) were studying drinking in rats. Half of the rats being
studied were maintained on a high-protein diet, which requires a great
deal of water for digestion; the other half were maintained on a lower-
protein diet, which requires less water. Not surprisingly, rats in the lower-
protein group drank less water every day. In a second phase, however,
the low-protein rats were switched to the high-protein diet. What they
did was very revealing. At first, they responded to the new demand for
water after each meal. But then, within a few days, the extra drinking
moved forward in time so that the rats began drinking extra water with
each meal. Thus, the extra drinking began to be done in anticipation of
the need rather than in response to it. When rats on the high-protein diet
were switched to the lower-protein diet, they continued to drink water
with their meals, although the amount of water they drank gradually
decreased. The study suggests that although the rat can adjust its water
intake so that it responds to need, drinking occurs to prevent need rather
than to escape it.
Another ingenious experiment made a similar point about eating (Le-
Magnen, 1959, described in Seeley, Ramsay, & Woods, 1997). This time,
LeMagnen gave rats three 1-hour opportunities to eat food (Meals 1, 2,
and 3) each day. After the rats had become accustomed to this schedule,
LeMagnen withdrew Meal 2. Initially, the rats compensated by increas-
ing their consumption during Meal 3—that is, in response to the new
need generated by missing Meal 2. But eventually, the rats began to eat
the extra food during Meal 1, which LeMagnen attributed to learning.
In effect, although eating initially responded to need, in the long run it
seems to be influenced by learning and organized to prevent—rather
than respond to—depletion.
The idea that eating occurs in response to need suggests that animals
might wait until the tank is empty before they fill it. In reality, if one stops
and checks the fullness of the lab rat’s stomach under ordinary free-feeding
conditions, it is never empty (Collier, Hirsch, & Kanarek, 1977). When food
is freely available, the typical rat eats 9 to 12 meals a day, mostly at night
and around the time that the lights switch on or off on the day-night light
schedule. Various attempts have been made to predict the size of each
meal based on the amount of time that has passed since the last meal. If
the animal is repleting some depletion, it should consume a bigger meal
the longer the time since the last meal. Often, there is little correlation
between the intermeal interval and the size of the next meal (Collier et
al., 1977), and the generality of the finding when it occurs is controversial
(Collier & Johnson, 1997). Woods and Strubbe (1994) emphasized that a
great deal of the rat’s food is eaten when the lights in the colony room
go on or off at the beginning or end of each day (presumably, regardless
of the time since the last meal). Woods and Strubbe note that meals are
actually disruptive events because they put so many difficult metabolic
demands on the animal’s body (see also Woods, 1991). Pavlovian cues that
predict meals can actually help the animal adapt because the cues elicit
CRs that compensate for the upcoming meal (see Chapter 2). By eating
food at the time of salient events, like lights-on and lights-off, the system
takes advantage of Pavlovian CRs that can occur in response to these cues.
Once again, learning factors—rather than merely need factors—may play a
powerful role in controlling motivated behavior. Learning is now thought

Figure 9.5  Meal frequency, meal size, and total intake as a function of
the cost of procuring access to food. The data are hypothetical but
represent results that have been obtained in studies of a number of
different species. (After Collier & Johnson, 1997.) [Plot: Total intake
(g), meal frequency (number of meals daily), and meal size (g) as a
function of log procurement price.]

to be very important in the control of feeding and food regulation (e.g.,
Seeley et al., 1997).
For many years, George Collier studied several species of animals using
a method in which the animals live in Skinner boxes, where they earn all
their daily food (Collier, 1981; Collier & Johnson, 1997; Collier et al., 1977).
In a typical experiment, Collier and his colleagues manipulated the cost of
meals by requiring the animals to make a certain number of lever presses
to get the food. The animals determined the size of each meal because
once a meal was earned, the food bin stayed available until they stopped
eating. In this way, the animals could control the number of meals—and
the size of each meal—each day. Collier found that regardless of species,
daily meal size and meal number are lawfully related to meal cost. As the
cost increases, the animals take fewer—but bigger—meals in a way that
keeps their total daily intake more or less constant (Figure 9.5). At any
given meal cost, though, meal size, meal number, and intermeal intervals
are still variable, and time since the last meal is a poor predictor of the size
of the next meal—or the rate at which the animals consume it. In the long
run, Collier and Johnson (for example) concluded that “meal patterns are
tools in the service of the optimization of expenditures of time and effort
to exploit resources. They do not reflect the cyclic processes of depletion
and repletion” (Collier & Johnson, 1997, p. 126). Something other than the
momentary state of need explains feeding.
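
To make the trade-off concrete, here is a minimal sketch in Python, with
invented numbers chosen only to mimic the shape of Figure 9.5 rather than
anyone's data, of meal frequency and meal size adjusting to procurement
cost while total daily intake stays roughly constant:

    import math

    daily_intake_g = 20.0                   # assumed defended daily intake; illustrative only
    procurement_costs = [1, 10, 100, 1000]  # hypothetical lever-press "prices" for meal access

    for cost in procurement_costs:
        # Assumption: meal frequency falls roughly with the log of the procurement price,
        # from about 10 meals/day at low cost toward 1 meal/day at very high cost.
        meals_per_day = max(1.0, 10.0 - 3.0 * math.log10(cost))
        # Meal size grows to compensate, so total daily intake is approximately defended.
        meal_size_g = daily_intake_g / meals_per_day
        print(f"cost={cost:>4}  meals/day={meals_per_day:4.1f}  "
              f"meal size={meal_size_g:5.1f} g  total={meals_per_day * meal_size_g:4.1f} g")
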
Leann Birch and her colleagues studied eating in young children, and
they came to a complementary conclusion (e.g., Birch & Fisher, 1996). New-
borns start out as depletion-repletion feeders who cry when they need food
and then get fed. But not much later in development, other learning pro-
cesses begin to take over. By about 6 months of age, babies learn that cues
in the evening predict a period of fasting overnight, and they begin to take
larger meals in anticipation of the fast. By preschool age, environmental
and social cues begin to control the initiation of meals. For example, in one
study (Birch, 1991) children played repeatedly in two rooms. In one room,
food was freely available, and in another room, food was never available.
In a final test, the kids were first given a large dish of ice cream, and then
they were allowed to play in these rooms with snacks freely available.

Figure 9.6  Children who are otherwise “full” eat more snack food (left)
and take less time to begin eating it (right) when they are tested in a
room that has been previously associated with food (CS+) as opposed to a
room not previously associated with food. (After Birch, 1991.) [Plot: Ad
lib consumption (Kcals) and Latency (s) by test condition, CS+ versus CS–.]

Despite being full, the children consumed more snack food—and they
were quicker to begin eating it—in the room that had been associated
with food (Figure 9.6) than in the other room. This finding is similar to a
remarkable phenomenon known to occur in rats. On a number of trials,
Harvey Weingarten (1983, 1984) gave rats a combination light and buzzer
CS before delivery of evaporated milk in the rats’ home cage. Later, when
the light and buzzer were presented, even satiated rats got up within 5
seconds and ate about 20% of their ordinary daily intake! (See also Bog-
giano, Dorsey, Thomas, & Murdaugh, 2009; Reppucci & Petrovich, 2012;
Weingarten, 1990.) This phenomenon seems familiar to me. I tend to eat at
noon or around 7:00 pm because temporal cues associated with previous
meals basically tell me (motivate me?) to eat again.
Children’s food intake is also strongly influenced by their food prefer-
ences, which are learned through their repeated experiences with foods
and eating as they mature (Birch & Fisher, 1996). For example, in several
experiments, children acquired preferences for flavors that had been mixed
with maltodextrin (a starch; Birch, McPhee, Steinberg, & Sullivan, 1990) or
high-fat concentrations of yogurt (Johnson, McPhee, & Birch, 1991; Kern,
McPhee, Fisher, Johnson, & Birch, 1993). The results are reminiscent of other
research in rats, discussed in Chapter 2, that indicate that animals learn to
prefer flavors associated with a variety of nutrients (e.g., Sclafani, 1997).
Children can also be taught to pay attention to both internal and external
cues. Four-year-olds who were taught to pay attention to their internal full-
ness (by filling the glass stomach of a doll, talking with an adult about how
you know when you are full, etc.) were less likely to eat additional snacks
after eating yogurt than were others who had been taught to pay attention
to external cues such as eating at the sound of a bell (Birch, McPhee, Shoba,
Steinberg, & Krehbiel, 1987). Learning clearly plays an important role in
eating and food selection.

A final comment on specific hungers—one of the early phenomena
that stimulated interest in the depletion-repletion model—is in order. One
review has raised doubt about the idea that animals can select healthy
diets when given various foods in a cafeteria arrangement (Galef, 1991).
Many of these “automatic” adaptations to specific needs are not auto-
matic at all—that is, when specific hungers seem evident in behavior, they
often depend on learning. For example, Rozin and Kalat (1971) reviewed
a number of experiments from the late 1960s that had been conducted
on specific hungers for nutrients like thiamine. Typically, rats are made
deficient in thiamine by giving them a diet that is lacking this nutrient. As
a consequence, after a few weeks, the rats get scruffy and sick. If the rats
are then offered a new diet that is rich in thiamine, they will choose it, but
preference for the new diet is blind. For example, the rats choose the new
diet regardless of whether the new diet or the old diet now has thiamine
in it (Figure 9.7A; Rozin & Rodgers, 1967). In addition, when given a
choice between two new diets—one that contains thiamine and one that
does not—the rats go to either diet indiscriminately (Figure 9.7B; Rodg-
ers, 1967). When Rozin (1967) watched the behavior of rats while they ate
the original thiamine-deficient diet, they ate little of it, but also tended to
paw it and spill it. In the end, the main explanation of the phenomenon
was that the rats learned an aversion to the diet that made them sick, and
they subsequently avoided it. Except perhaps for sodium deficiency—and
possibly phosphorus deficiency—there is no behavioral response that au-
tomatically comes into play to satisfy a specific nutritional need. Instead,
rats learn aversions to foods that make them sick, and they may ultimately
also learn to prefer foods that make them well. Once again, the relation
between need and motivation is not as tight as originally supposed, and
once again, we find an important role for learning.

[Figure 9.7 panels: (A) Old vs. new diet; (B) Two new diets. Y-axis:
Preference (%).]
Figure 9.7  (A) Thiamine-deficient rats prefer a new diet over the old one
regardless of whether the new diet or the old diet has been fortified with
thiamine. (B) Deficient rats also initially choose indiscriminately between
new diets—whether the diets contain thiamine or not. The explanation: The
rats have learned an aversion to the old diet (because they got sick while
eating it) and now avoid it. (Data from Rodgers, 1967, and Rozin &
Rodgers, 1967.)

To summarize, although animals in need of food or water will perform
behaviors that correct their deficit, responding to depletion is probably
something saved for emergencies. Instead, a great deal of motivated behav-
ior seems to anticipate—rather than respond to—need. Learning thus plays
another very powerful role. Given all this research, we might wonder what
we mean when we say that we are “hungry.” We might describe ourselves
this way either when we need food (e.g., hours after our last meal or after a
strenuous hike) or when we are exposed to cues that have been associated
with food (e.g. meal times or restaurants emanating lovely food aromas).
Hunger motivation seems to be aroused by either biological need or by
signals for food (Weingarten, 1985). It is therefore worth thinking more
about how signals actually motivate.

Anticipating Reward and Punishment


Bait and switch
The idea that signals and expectations motivate behavior was another of
Tolman’s ideas. You might remember that in the latent learning experi-
ment (Tolman & Honzik, 1930), rats that explored a maze without reward
quickly got through the maze efficiently once they were rewarded. The
idea was that knowledge that there was a reward at the end of the tun-
nel, so to speak, gave the rats a reason to get through the maze efficiently.
Acquired motivation is a name given to this kind of motivation because
it is acquired through experience—unlike the motivation thought to be
provided by drives and needs.
Other experiments conducted in Tolman’s lab supported the same kind
of idea. For example, Tinklepaugh (1928) had monkeys learn to open a box
to find a tasty piece of banana inside. On test trials, Tinklepaugh secretly
replaced the banana with a piece of lettuce. Lettuce is a reasonable reward
for a monkey, but it is not as good as banana, and the bait-and-switch trick
had a major effect on the monkey once the lettuce was discovered. Accord-
ing to Tinklepaugh, the monkeys shrieked, looked angry and frustrated,
and appeared to search for the missing banana. They behaved as if they
had expected banana and that the switch to lettuce made them mad. In a
related experiment with rats, Elliott (1928) showed that rats ran quickly
through a maze for a tasty wet-food mash reward. When the reward was
switched to sunflower seeds (which rats like a bit less), the previous be-
havior fell apart. The rats who were switched to sunflower seeds became
even slower to run through the maze than a group of rats that had been
rewarded with sunflower seeds all along. Once again, animals appeared
to expect a certain outcome, and an emotional effect occurred when they
discovered a less preferred commodity.
These motivating effects of rewards were not mentioned in Hull’s origi-
nal theory (1943). In fact, the idea that rewards motivate seems to have been
rather slow to take hold, presumably because the field was so enthralled
with Thorndike’s stamping-in reinforcement mechanism. However, ev-

[Figure 9.8 plot: Running speed (feet/s) across preshift and postshift
trials for groups shifted from 256 to 16 pellets, maintained at 16
pellets, and shifted from 1 to 16 pellets.]

Figure 9.8  Running speeds of rats given rewards of different sizes. When
shifted from 256 pellets to 16 pellets, running speed went below the speed of
a group that received 16 pellets all along (“negative contrast”). When shifted
from 1 pellet to 16 pellets, running speed went above (“positive contrast”).
(Data from Crespi, 1942; after Bolles, 1975.)

eryone did finally notice some experiments by Crespi (1942). Crespi also
examined the effects of incentive switches in rats. In one experiment, he
trained rats to run down a 20-foot runway to get 1, 16, or 256 food pellets.
As shown at left in Figure 9.8, running speed was faster the bigger the
reward. In the next phase, all the rats received 16 pellets. The effect on
performance was immediate. The rats that were switched from 256 pel-
lets down to 16 showed a negative contrast effect: Their running speed
abruptly went lower than that of the rats that had been receiving 16 pellets
all along. (This change in number of pellets is like being switched from
banana to lettuce or from mash to sunflower seeds.) Rats switched from 1
pellet to 16 pellets showed a positive contrast effect: Their running speed
immediately increased and overshot the rats that had been receiving 16
pellets all along. It was as if the upshift in reward size made the rats elated
(Crespi called positive and negative contrast “elation” and “depression” ef-
fects, respectively). The idea is that reward shifts can have emotional effects
on behavior. The effects of the 16-pellet reward in the second phase clearly
depended on prior experience—previous exposure to better rewards made
the middle value feel bad (negative contrast), whereas previous exposure to
worse rewards made the middle value feel quite good (positive contrast).
Later studies confirmed and extended these effects. Charles Flaherty
and his students ran a large number of related experiments, and Flaherty
reviewed the literature in a book aptly entitled Incentive Relativity (Flaherty,

Figure 9.9  Successive negative contrast. Rats were given brief daily
drinks of either a 32% or a 4% sucrose-water solution. When switched from
the 32% to the 4% solution, they licked less of the 4% solution than the
group that was given 4% all along. (After Flaherty, 1991.) [Plot: Mean
number of licks across preshift and postshift days for the 32-to-4 and
4-to-4 groups.]

1996). Both positive and negative contrast have been widely demonstrated,
although negative contrast effects are easier to obtain and are more widely
studied than positive contrast effects. In one method, rats are given a daily
5-minute drink of either a very tasty 32% sucrose solution or a tasty (but
less so) 4% sucrose solution. The experimenters usually measure licking of
the solutions; it is not surprising that the rats lick more of the 32% solution
than the 4% solution (Figure 9.9). Then, after several days of drinking the
32% solution, the rats are suddenly given a 4% solution for several days.
As shown in Figure 9.9, there is once again an abrupt shift in performance,
and the 4% solution elicits less licking in the rats that had been previously
drinking the 32% solution. Flaherty and his colleagues tested whether this
example of negative contrast is associated with an emotional effect by
measuring the level of a stress hormone (corticosterone) in the blood after
the 4% solution drinking sessions (Flaherty, Becker, & Pohorecky, 1985).
They also tested the effects of several tranquilizing drugs, like chlordiaz-
epoxide (Librium), on negative contrast (Flaherty, Lombardi, Wrightson,
& Deptula, 1980). An interesting pattern emerged: Remarkably, there was
no emotional effect on the first day of exposure to the new 4% solution;
negative contrast was clearly evident in drinking behavior, but there was
no elevation of corticosterone and no effect of tranquilizers. But emotions
kicked in on the second day after the shift, when plasma corticosterone
levels rose and tranquilizers abolished the contrast effect. On Day 1, the rats
drank less because they detected the downshift and became busy explor-
ing elsewhere in the environment, as if they were searching for the 32%
solution (Flaherty, 1991; Pecoraro, Timberlake, & Tinsley, 1999). Then, on
Day 2 with the less palatable solution, the rats began to get frustrated. An
initial exploratory response was followed by an emotional response. It is
interesting that Tinklepaugh (1928) had also described both types of effects.

The Hullian response: Incentive motivation


Given experiments like Crespi’s, it became clear by the 1950s that rewards
and their anticipations had motivating effects. In addition to stamping in
or reinforcing behavior, rewards also motivated. To capture the idea, Hull
(1952)—and especially his student Kenneth Spence (1951, 1956), whom we
met in Chapter 8 when we considered discrimination learning—began
to emphasize a new motivational construct, incentive motivation. Hull
(1952) added another term in his equation for behavior strength. In addi-
tion to behavior being a function of learning (“Habit”) and need (“Drive”),
behavior was said to depend on the motivating effect of reward. This new
theoretical construct, “Incentive,” was called K in Hull’s equation, perhaps
after Kenneth (“I” had already been taken by several inhibitory factors
that I have not told you about). In equation form, the bottom line was that

Behavior strength = D × H × K

So, if we want to predict the vigor of an instrumental action, we need to
think about two motivational factors in addition to how well the action is
learned. Notice that Incentive (K) was given a status equal to that of Drive
(D) and Habit (H)—if any of the three were equal to zero, there would be no
behavior strength. Drive was not learned, but was internal and linked to
need. In contrast, Incentive was learned, externally driven, and linked to re-
ward. Incentive pulled behavior, whereas Drive was said to push behavior.
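
Because the terms are multiplied, the prediction is easy to check with a
few lines of arithmetic. The sketch below is only an illustration of the
multiplicative rule; the 0-to-1 scaling of each factor is my assumption,
not Hull's:

    def behavior_strength(drive, habit, incentive):
        # Hull's (1952) multiplicative rule: strength = D x H x K.
        return drive * habit * incentive

    # A well-learned response (high Habit) with a good reward available (high Incentive)
    # still yields zero predicted strength if Drive is zero, and vice versa.
    print(behavior_strength(drive=0.0, habit=0.9, incentive=0.8))  # 0.0
    print(behavior_strength(drive=0.7, habit=0.9, incentive=0.0))  # 0.0
    print(behavior_strength(drive=0.7, habit=0.9, incentive=0.8))  # 0.504
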
The reward effects that led to proposing incentive motivation all seemed
to depend on the animal learning to expect a particular reward. This was
clear in Tinklepaugh’s banana-to-lettuce experiment, and it was clear in
Crespi’s experiment as well: It is the change from the expected value of
the reward that causes positive and negative contrast effects. This kind
of description was difficult to accept in the 1950s because it seemed
mentalistic and nonscientific. Instead, Hull and his students—especially
Spence (e.g., 1951, 1956)—developed a powerful way to understand incentive
motivation in terms of S-R theory. The general idea is known as the rG-sG
mechanism, and it is based on an earlier paper by Hull (1931).

Figure 9.10  How a “fractional anticipatory goal reaction” (rG) causes
incentive motivation in the runway, according to Hull and Spence. When
rewarded with food in the goal area, the reward elicits a large goal
reaction (RG). This response is associated with stimuli in the goal box,
which generalize to the start box. The conditioned response elicited by
the cues in the start box (rG) now provides motivation for the rat to run
rapidly on the next trial. Although the emphasis on S-R learning seems
old-fashioned, the big idea is that Pavlovian conditioning motivates
instrumental behavior. [Diagram: a runway from Start to Goal, with rG at
the start box and RG at the goal box.]

The idea can be simplified with a picture (Figure 9.10). A rat runs down
a runway and gets rewarded in a goal box. In Hullian theory, one effect
of the reward is to reinforce the running response through drive
reduction, but another thing happens, too. The food itself elicits a big
response—called a goal reaction, or RG. For example, the rat
may salivate when it eats the food—exactly as in Pavlov’s experiments. The
crucial thing is that this goal reaction becomes associated with stimuli in
the goal box through classical conditioning. As a consequence, the goal box
itself will elicit a smaller version of the goal reaction, a so-called fractional
anticipatory goal reaction, or rG. (A “little r” is used because CRs are not
quite as big as URs.) There are two important consequences of the condi-
tioning of rG. First, if the start box is similar to the goal box, there will be
some generalization between them, and when the rat is put into the start
box on the next trial, the start box will therefore elicit some rG. This anticipa-
tory reaction provides the incentive motivation—it energizes the running
response. A second consequence is that rG has stimulus effects—that is, the
rat detects the salivation, and this is a stimulus to which responses can be
attached. The stimulus was called sG. The presence of sG provides a stimu-
lus that can persist and allow a series of responses to become attached to
it. The ideas seem a little quaint these days, but in the hands of a brilliant
theorist like Spence, they could be expanded to explain many complex
and sophisticated experimental results. The bottom line, though, is that
the motivating effects of reward were attributed to a mechanism based on classical
conditioning. In this very important sense, incentive motivation involves
all the conditioning laws and processes discussed in Chapters 3 through 5.
The incentive motivation concept is consistent with the results. For
example, consider Crespi’s group of rats that was shifted from 1 pellet to
16 pellets. Habit was learned in Phase 1. Then, on the first trial of Phase
2, there was a bigger RG to the bigger reward; it was quickly conditioned,
generalized to the start box, and the bigger rG then energized the running
response all the more. The same scenario was thought to be involved in the
latent learning experiment (see Figure 7.5) in which animals also received
an upshift from zero to some larger reward. Here, the argument was that
there was some unknown reinforcer that was allowing learning to occur
in Phase 1—perhaps the rats found it reinforcing to be greeted by Honzik
at the end of the maze. Then, as in Crespi’s experiment, there was a big
goal reaction that became conditioned to the maze and then invigorated
the response on the next trial.
Frustration
In fact, rG was just one of several anticipatory responses learned through
classical conditioning that were thought to motivate instrumental action.
Whereas a positive event like food or water was thought to condition posi-
tive goal reactions, an aversive event like electric shock or pain was thought
to condition fear responses, or rE (Mowrer, 1947). This kind of response
was crucial for motivating avoidance learning, as we will see below and in
Chapter 10. And there was another “little r” that was developed by another
important theorist, Abram Amsel (1958, 1962, 1992). When the animal ex-
pects a big reward (which is to say, it has a large conditioned rG) but then
receives a small reward, there is a primary frustration reaction, or RF. The
size of this frustration reaction is determined by the size of the discrep-
ancy between the reward expected and the reward obtained. Frustration
is also conditioned through classical conditioning so that another little r,
conditioned stimuli that elicit it (Daly, 1969). It is the emotional part of the
negative contrast effect: Frustration connected with a decrease in the size
of reward becomes conditioned, generalizes to the start box, demotivates,
and elicits competing behavior.
Frustration is a powerful emotion. If you have ever been thwarted in
reaching a goal, you know what it means. A good place to observe frus-
tration reactions is near a soda machine that is temporarily out of order:
People put money in, expecting a cool drink, and when they get nothing
in return, they may rattle the coin return, look a little angry, gesticulate
wildly, and bang on the machine. Amsel and Roussel (1952) reported an
early experiment that convinced people of the importance of frustration.
Rats were run in a double runway in which the goal box of the first runway
was also the start box for a second runway. Amsel and Roussel sometimes
provided a reward in the first goal box and sometimes not. Running speed
in the second runway was faster following nonreward than following re-
ward; the second running response was motivated by frustration (see also
Amsel & Ward, 1965). The thwarted expectation was what counted: Rats
that never received a reward in the first runway—and therefore did not
expect it—did not run as fast in the second runway (Figure 9.11; Wagner,
1959). Getting no reward when one is expected causes a frustration re-
sponse that energizes performance.
Frustration theory helps explain a host of interesting phenomena that
are known as paradoxical reward effects. The word paradoxical refers to
the fact that a reward can sometimes seem to weaken—and a nonreward

Figure 9.11  The effect of frustration in the double runway. Points
indicate the speed of running in the second alley. In the test trials at
the right, speed of running in the second alley is especially fast when an
expected reward in Alley 1 (A1) did not occur. Control rats that never
received a reward in A1—and therefore never expected it—did not run as
fast. (After Wagner, 1959.) [Plot: Running speed (feet/s) over trials;
conditions: Following reward in A1, Following no reward in A1, Never
rewarded in A1.]

can sometimes seem to strengthen—instrumental action. One example is
the successive negative contrast effect shown in Figures 9.8 and 9.9. Prior
experience with a large reward can make a perfectly good smaller reward
rather ineffective. Another example is the “magnitude of reinforcement
extinction effect.” In instrumental learning, reinforcement with a large
reward can actually lead to faster extinction than reinforcement with a
smaller reward (Hulse, 1958; Wagner, 1961). Still another example is the
“overlearning extinction effect” in which many rewarded trials can para-
doxically increase the rate of extinction, relative to fewer rewarded trials
(e.g., Ison, 1962; Siegel & Wagner, 1963). All these effects are consistent with
frustration theory because large rewards and extensive training will cause a
large amount of frustration when a reward is suddenly omitted at the start
of extinction. Helen Daly and John Daly (1982, 1994) presented a model that
combined the principles of frustration theory with the Rescorla-Wagner
model in order to explain a large number of these effects. The administra-
tion of reward and nonreward can have peculiar outcomes that are not
predicted by the simple law of effect or, for that matter, most theories of
classical conditioning.
Another paradoxical reward effect
A quite separate line of experiments with humans also suggests that re-
wards can have negative effects. Deci (1971) had college students play with
puzzles in the lab. In a crucial phase, participants in an experimental group
were given $1 for every puzzle they finished, whereas a control group was
allowed to continue playing without reward. In a subsequent test without
reward, the previously rewarded group tended to spend less time solving
puzzles than the nonrewarded group. “Extrinsic reward” was said to hurt
“intrinsic motivation” for playing with the puzzles. In another study, Lep-
per, Greene, and Nisbett (1973) studied preschool children who initially
spent a great deal of time drawing with Magic Markers. The children were
divided into three groups: One group received no reward for drawing, and
two groups received a “Good Player Award” certificate with a gold seal and
ribbon. One rewarded group was told that they would receive the award,
and the other rewarded group received it unexpectedly. In a subsequent
test, the group that received the expected reward spent less time drawing
than the other two groups. The unexpected reward had no such effect.
Once again—for the expected reward group, at least—an extrinsic reward
appeared to hurt performance.
The “punished-by-reward” phenomenon and related phenomena that
have been reported since (see Deci, Koestner, & Ryan, 1999) suggested to
some writers that the use of positive rewards can actually hurt human
creativity, productivity, and potential (Figure 9.12; see discussion in Eisen-
berger & Cameron, 1996). In fact, though, the effect is easy to exaggerate.
It would be a stretch to believe that the very high monetary rewards paid
to stars like LeBron James, Jordan Spieth, or Jeff Bezos really hurt their ex-
traordinary basketball, golf, or business games. Moreover, the finding is not

Figure 9.12  Trouble for the future of American sports? (Cartoon © The
New Yorker Collection, 2006, Leo Cullum, from cartoonbank.com. All Rights
Reserved.)

necessarily inconsistent with what we already know about reinforcement.
In Chapter 7, we saw that modern views of reinforcement allow rewards
(contingent activities) to punish instrumental activities that are more pre-
ferred (Premack, 1971a). We have also just considered other paradoxical
reward effects. Large rewards can lead to quicker extinction, and the effect
of a reward in the second phase of a contrast experiment clearly depends on
what reward came before. The point is that we already know that reward
effects are relative; whether a reward has a positive or negative effect can
depend on many factors.
A necessary step in understanding the punished-by-reward phenom-
enon, then, is to figure out what contexts make the phenomenon happen.
Cameron and Pierce (1994) and Eisenberger and Cameron (1996) divided
the experiments that had been published on the phenomenon into several
categories. A statistical analysis of the overall results (a meta-analysis)
then suggested that the harmful effects of extrinsic reward were restricted
to situations in which people were given tangible rewards (e.g., money)
that were announced ahead of time but delivered in a way that was not
dependent on the person’s actual performance. Verbal rewards generally
improved performance. A different meta-analysis by Deci et al. (1999) con-
firmed this result and also found that rewards mostly have harmful effects
when they can be perceived as a means by which someone is trying to
control you (presumably a less preferred state of affairs in the Premack-
ian sense). Yet another meta-analysis (Byron & Khazanchi, 2012) found
that rewards increase creativity if the experimenter is careful to reward
creative performance rather than merely routine performance or simple
completion of the task. In the long run, there seems to be little question
that the skilled use of rewards can therefore increase human performance
in a very large number of instances (Eisenberger & Cameron, 1996). What
matters are factors like expectations and perhaps the interpersonal context
in which rewards occur.
Partial reinforcement and persistence
The most well-known paradoxical reward effect is one that has not been
mentioned yet: the partial reinforcement extinction effect (PREE). This
effect, which may be the most extensively studied phenomenon in all in-
strumental learning, goes something like this. One group of subjects is
continuously reinforced—that is, the subjects are reinforced every time they
run down a runway. Another group is reinforced only 50% of the time—
that is, half of the trials lead to reinforcement (R), and half of the trials
lead to nonreinforcement (N). As shown in Figure 9.13, the continuously
reinforced group shows more performance during the acquisition phase.
When extinction occurs, however, the partially reinforced group is slower
to stop responding. Nonrewarded trials in acquisition can make behavior
more persistent. The effect is widely known in instrumental learning (e.g.,
Sheffield, 1949; Weinstock, 1954) and occurs in classical conditioning, too
(e.g., Gibbs, Latham, & Gormezano, 1978; Haselgrove, Aydin, & Pearce,
2004; Rescorla, 1999b).
The partial reinforcement extinction effect is important for both theo-
retical and practical reasons. On the theoretical side, it suggests that there
is more to learning than habit or associative strength, both of which are
supposed to be some function of the number of reinforced trials. Associa-
tive strength would arguably be weaker in the partially reinforced group

Figure 9.13  The partial reinforcement extinction effect (PREE). During
acquisition, a continuously reinforced group (CRF) receives reward on 100%
of the trials. In contrast, a partially reinforced group (PRF) receives
reward on only 50% of the trials (for example). When extinction occurs and
trials are no longer reinforced, the PRF group is more persistent—that is,
it is slower to stop responding than the CRF group. [Plot: Running speed
over acquisition trials and extinction trials for the CRF and PRF groups.]

because they receive half the reinforced trials and yet behavior is more
persistent. The practical implication is that the PREE provides a way to
help make behavior more persistent. If you want to teach your children to
persevere in the face of adversity, the idea would be to expose them to some
partial reinforcement in their lives. In rats, a partially reinforced behavior
is more resistant to punishment (it is less suppressed when it is later paired
with something nasty, like an electric shock) in addition to being more re-
sistant to extinction (Brown & Wagner, 1964). Partial reinforcement of one
behavior can also produce generalized persistence with other behaviors
in both animals (e.g., McCuller, Wong, & Amsel, 1976) and humans (e.g.,
see Nation, Cooney, & Gartrell, 1979). Thus, partial reinforcement may be
a technique that increases general persistence (see Amsel, 1992). Robert
Eisenberger investigated similar issues in a number of experiments with
humans and animals (e.g., Eisenberger, 1992). He argues that rewarding
effort on one task can increase the persistence and effort that we spend on
other tasks—something he calls “learned industriousness.”
Frustration theory provides an interesting explanation of the PREE (e.g.,
Amsel, 1958, 1962). The idea is that a partially reinforced subject experi-
ences some frustration (rF) on the nonrewarded trials. Frustration, like rG,
also has a stimulus effect: sF, which is present on the next reinforced trial,
when the animal is reinforced again. In the long run, the animal is rein-
forced for responding in the presence of frustration. So, when extinction
begins, frustration occurs, but the animal continues to respond (and thus
persists) because that is what it has been reinforced to do.
Frustration theory’s approach to the PREE is powerful and interesting,
but a rival theory known as sequential theory was soon proposed by
John Capaldi (e.g., Capaldi, 1967, 1994; see also Capaldi & Martins, 2010).
Sequential theory is a sophisticated version of the idea that behavior will
persist in extinction as long as the stimulus conditions are similar to those
that were present during acquisition. Extinction is full of many nonre-
warded (N) trials, and partially reinforced subjects have been reinforced
for responding after they have received N trials; in contrast, continuously
reinforced subjects have not. Capaldi argued that the crucial stimulus here
is the animal’s memory of previous trials. During acquisition, a partially
reinforced subject is reinforced while it remembers recent N trials, whereas
a continuously reinforced subject is always reinforced while remembering
recent reinforcement (R). Once extinction begins, all subjects are asked to
respond on each trial while remembering previous N trials; the partially
reinforced subject responds more than does the continuously reinforced
subject because that is what it has been trained to do. The explanation
sounds a little like that of frustration theory, but sequential theory empha-
sizes memory rather than frustration.
The length of the preceding string of N trials (“N-length”) is also impor-
tant. In extinction, N-length keeps increasing as extinction trials continue.
Eventually, even a partially reinforced subject will stop responding when
the string of recent N trials gets noticeably longer than the one that was
reinforced during training. For example, a rat that receives an RNRNRNR
sequence during training is always reinforced for responding after a single
N trial (an N-length of 1). As extinction continues, the subject is asked to
respond after a longer and longer string of N trials; the test conditions thus
become increasingly different from the training conditions. But consider a
partially reinforced subject that was reinforced for responding every fourth
trial (RNNNRNNNRNNNR). You can think of the animal learning that a
memory with an N-length of 3 is a signal for reward on the next trial (Capal-
di, 1994). Hopefully, you will see that this subject will be even more persistent
in extinction because an N-length of 3 will generalize over a longer string of
extinction trials. According to the theory, the number of different N-lengths
that are reinforced will also increase resistance to extinction.
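
One way to see the logic is to tabulate, for a given training sequence,
the N-lengths that were followed by reward. The sketch below is my own
illustration of Capaldi's idea, not a fitted model, and the persistence
rule at the end (respond until the growing N-length clearly exceeds the
longest reinforced one) is a deliberate simplification:

    def reinforced_n_lengths(training_sequence):
        # Return the set of N-run lengths that immediately preceded a reinforced (R) trial.
        n_lengths = set()
        run = 0
        for trial in training_sequence:
            if trial == "N":
                run += 1
            else:  # an R trial: the memory of 'run' preceding N trials is paired with reward
                n_lengths.add(run)
                run = 0
        return n_lengths

    for sequence in ["RNRNRNR", "RNNNRNNNRNNNR"]:
        lengths = reinforced_n_lengths(sequence)
        # Crude stand-in for generalization: responding persists in extinction until the
        # string of N trials clearly exceeds the longest N-length reinforced in training.
        print(sequence, "reinforced N-lengths:", sorted(lengths),
              "-> persists for roughly", max(lengths) + 1, "or more extinction trials")
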
Sequential theory makes a number of unique predictions that have been
confirmed in the laboratory (e.g., Capaldi & Birmingham, 1998). Several
experiments established that rather subtle variations in the sequence of
rewarded and nonrewarded trials during training can be important. For
example, Capaldi and Capaldi (1970) and Leonard (1969) compared groups
that received different sequences of nonreinforced trials, trials with large
rewards (R), and trials with small rewards (r). One group received repeti-
tions of an rNR sequence during training, and a second group received
repetitions of RNr. The former group received a big reinforcer while re-
membering N, which should have produced stronger reinforcement of
responding in the presence of the memory of N. The second group received
less reinforcement while remembering N. For that reason, the theory pre-
dicted that rNR would lead to more persistence in extinction than RNr,
which is exactly the result that was obtained (Figure 9.14).
Other experiments suggest that it is the memory of R and N that really
counts. For example, in the middle of a string of N trials during acquisi-
tion, Capaldi (1964) and Capaldi and Spivey (1963) occasionally put the
rats in the goal box and gave them food. These feedings were designed to

Figure 9.14  Response speed at the end of acquisition (left) and then on
extinction trials after receiving acquisition with rNR or RNr sequences
(right). Having the large reward (R) after the N trial caused more
resistance to extinction than having it before the N trial. The results
were predicted by sequential theory. (After Leonard, 1969.) [Plot: Running
speed (inches/s) at the end of acquisition and across extinction trials
for the rNR and RNr groups.]

insert a memory of R into a string of Ns, thereby shortening the N-length


that was present when the rat actually ran on the next reinforced trial. The
technique worked; running extinguished more quickly in the subjects that
had received the inserted memories of R in their strings of N than those
that had not. Sequential theory thus made a number of specific and test-
able predictions that were confirmed and that frustration theory was not
ready to handle.
What does all this mean? On the practical side, we now know quite a
bit about how to slow down extinction—and thus maintain persistence in
behavior—with partial reinforcement techniques. On the theoretical side,
we also know quite a bit about why behavioral persistence is not merely a
function of the number of response-reinforcer pairings. It is interesting that
although the PREE was intensively investigated through the mid- to late
1960s, experimental interest in it declined a bit after that, perhaps partly
because it seemed so well understood—existing theories, particularly se-
quential theory, did an amazingly good job explaining the PREE.
It is also worth noting that research on the PREE has further expanded
our understanding of extinction. In our earlier discussion of extinction in
Chapter 5, I emphasized the observation that extinction involves new learn-
ing that appears to be relatively dependent on its context. That is a good way
to understand certain recovery-after-extinction phenomena, such as sponta-
neous recovery and the renewal effect. Research on the PREE, however, sug-
gests that two additional factors are also important. First, sequential theory
emphasizes that responding can stop in extinction when the animal stops
generalizing from acquisition. Thus, extinction can result in part from simple
generalization decrement. Second, frustration theory reminds us that there
are emotional and motivational consequences of occasions when expected
reinforcers do not occur. These motivating effects, along with generalization
decrement, need to be acknowledged in any complete account of extinction
(e.g., Bouton, 2004; Rescorla, 2001; Vurbic & Bouton, 2014).
It is interesting to observe that sequential theory actually has nothing
very motivational in it—subtle extinction effects can be explained in terms
of memory and associative strength, without any appeal to motivational
constructs, like frustration. So, what became of frustration theory? In the
last analysis, frustration seems to be a very real emotion that comes into
play when we are disappointed by a smaller-than-expected reinforcer. But
even though frustration might play a role in the PREE—especially after a
large number of acquisition trials (see Mackintosh, 1974)—it is not always
necessary to talk about frustration. By the late 1960s, an interest in the re-
lationship between learning and motivation was making room for a new
interest in the relationship between learning and memory and information
processing. The cognitive “revolution” in psychology had begun.
Motivation by expectancies
Spence, Mowrer, and Amsel had built a system that emphasized the role of
classically conditioned peripheral responses (rG, rE, and rF) in motivating in-
strumental action. Although these “little r”s could be taken as hypothetical
events, if classical conditioning does provide the motivational background
for instrumental learning, the strength of conditioning measured on a given
trial should then correlate with the strength of the instrumental behavior.
That is, we should be able to predict the vigor of an instrumental action
from the vigor of the Pavlovian motivational response.
By the mid-1960s, however, it was becoming clear that no such cor-
relation was going to be found. Many concurrent measurement studies
were run in which the “little r”s and instrumental behavior were both
monitored at the same time. The correlation between them was not impres-
sive. In appetitive experiments, where animals were performing for a food
or water reward, there was little evidence that the vigor of instrumental
responding was related to the vigor of a Pavlovian response like salivation
(e.g., Ellison & Konorski, 1964; Williams, 1965). The picture was equally
disheartening in avoidance learning, where rats and dogs were trained
to perform instrumental actions to avoid receiving a mild electric shock.
Fear (rE) was supposed to motivate this behavior, and Richard Solomon’s
laboratories at Harvard University (and, subsequently, the University of
Pennsylvania) were devoted to studying its role. There was no correlation
between measures of rE and the vigor of avoidance behavior; for example,
heart rate was not consistently related to the avoidance response in either
acquisition or extinction (e.g., Black, 1958, 1959). Although there is a gross
relationship between Pavlovian responding and instrumental behavior
(animals do salivate in appetitive experiments or show gross heart rate
changes in avoidance experiments), Pavlovian responses are not tightly
coupled with instrumental behavior in a way that suggests that Pavlovian
responses instigate or motivate it.
Findings like these suggested to some that we should discard the idea
that Pavlovian processes motivate instrumental behavior. But to trash
would be a little rash. In an important paper, Rescorla and Solomon (1967)
argued that the problem highlighted by these disappointing correlations
was only a problem if you have an old-fashioned idea about what Pavlov-
ian conditioning is all about. The motivation provided by the Pavlovian
process is not necessarily provided by peripheral responses, which are just
crude indices of a “central state”—like fear or appetitive excitement—that
is learned during classical conditioning (see also Mowrer, 1960). It is the
central state excited by a CS, not some peripheral heart rate or drooling
response, that motivates instrumental action. Although they did not use
the term, Rescorla and Solomon came close to saying that motivation was
provided by an expectancy of the reinforcer that is aroused by cues that
predict it (e.g., Bolles & Moot, 1972). In some respects, this idea is actually
consistent with Tolman’s original view.
The Rescorla and Solomon 1967 paper is one of the inputs—along with
effects like blocking, contingency learning, and flavor aversion learning (see
Chapter 3)—that caused a major change in our thinking about Pavlovian
conditioning in the late 1960s. As you know by now, conditioning theory is
no longer merely the study of “spit and twitches.” Moreover, the idea that
a Pavlovian state or expectancy motivates instrumental action is consistent
with the evidence. If such a state really motivates behavior, we should be
able to show that increasing or decreasing the intensity of the state will
cause a corresponding increase or decrease in the vigor of instrumental
action. That is exactly what Pavlovian-instrumental transfer experiments
are designed to test. In these experiments, an instrumental behavior is
trained, and a Pavlovian CS is conditioned in a separate phase. In a final
test, the CS is then presented while the animal is performing the instru-
mental behavior. Presenting the CS at this time should affect the animal’s
expectancy of the reinforcer. If it does, and if that expectancy motivates,
we should see the vigor of the instrumental action change accordingly.
The idea is illustrated by a classic experiment run in Solomon’s lab by
Rescorla and LoLordo (1965); other experiments with a similar logic had
been run before (e.g., Solomon & Turner, 1962). In an initial phase, dogs
were trained to avoid a brief electric shock by shuttling across a barrier
in a shuttle box. Once the dogs were avoiding shock by responding at a
stable rate, they were confined to one side of the box and given Pavlovian
conditioning. One CS was conditioned as an excitor (it was paired with
shock), and another CS was conditioned as an inhibitor (several inhibitory
conditioning methods—see Chapter 3—were used). In the final test, the
dogs were put back into the shuttle box, where they began to shuttle again
to avoid receiving shock. Then Rescorla and LoLordo merely presented the
Pavlovian excitor and inhibitor and watched how they affected the dog’s
avoidance rate. As Figure 9.15 shows, avoidance responding increased in
the presence of the excitor and decreased below the baseline in the pres-
ence of the inhibitor. A conditioned motivational state—excited by excitors
and inhibited by inhibitors—thus appears to influence avoidance behavior.
Pavlovian-instrumental transfer (PIT) has since been shown in many
other situations. For example, an appetitive instrumental action can be
influenced by a Pavlovian appetitive state (e.g., Lovibond, 1983). The effect

Figure 9.15  A Pavlovian-instrumental transfer experiment. Dogs that were
responding to avoid an electric shock were presented with a fear excitor
(CS+) or a fear inhibitor (CS–) while they exhibited avoidance behavior.
(The CSs had been conditioned in a phase separated from avoidance
training.) Presentation of the excitor enhanced avoidance rate, whereas
presentation of the inhibitor reduced the avoidance rate. Results like
this one suggest that a conditioned motivational state (“fear”) motivates
the instrumental avoidance response. (After Rescorla & LoLordo, 1965.)
[Plot: Responses per minute during the baseline, CS+, and CS– periods.]

TABLE 9.1  How Pavlovian states should influence instrumental action

                                          Pavlovian state
                              Appetitive US (excitement)      Aversive US (fear)
Instrumental action           Excitor (CS+)  Inhibitor (CS–)  Excitor (CS+)  Inhibitor (CS–)
Appetitive (motivated by
anticipation of food)         Increase       Decrease         Decrease       Increase
Avoidance (motivated
by fear)                      Decrease       Increase         Increase       Decrease

Source: After Rescorla & Solomon, 1967. (The shaded cells in the original are
those in which the CS was conditioned with the same outcome that reinforces
the instrumental action.)

also occurs in humans, who (for example) have learned to press a button
to earn delivery of M&M chocolates. When presented with a visual cue
that was separately paired with M&Ms, instrumental responding increases
(Colagiuri & Lovibond, 2015; Lovibond & Colagiuri, 2013). In fact, PIT ef-
fects also occur with drug reinforcers. Rats performing instrumental lever
pressing for alcohol (e.g., Cunningham, 1994; Krank, 1989) or for cocaine
(LeBlanc, Ostlund, & Maidment, 2012) increase their rate of responding
when presented with a Pavlovian CS associated with the reinforcer; an in-
hibitor can also suppress the rate (Cunningham, 1994; Krank, 1989). Similar
effects have been shown in smokers responding for cigarettes: Pictures of
cigarettes increase the strength of the instrumental response (e.g., Hogarth
& Chase, 2011; see also Hogarth, Dickinson, & Duka, 2010).
Rescorla and Solomon (1967) predicted great generality to the idea that in-
strumental actions are influenced by a Pavlovian motivation state. They were
quite explicit that regardless of whether the instrumental action is motivated
by appetitive reinforcers or aversive events like shock, it should be influenced
by corresponding excitors and inhibitors. This idea is summarized in the two
shaded boxes in Table 9.1, which show instrumental actions are expected to
increase or decrease in the presence of a Pavlovian CS+ or CS– conditioned
with the same reinforcer or outcome. By further assuming that fear and ap-
petitive motivation inhibit each other (see also Dickinson & Dearing, 1979;
Konorski, 1967), Rescorla and Solomon made the second set of predictions
shown outside the shaded boxes. Although a fear excitor was supposed to
increase the motivation for avoidance behavior, it was predicted to inhibit
the motivation for appetitive behavior. This result itself is rather interest-
ing—Pavlovian CSs cannot just elicit peripheral CRs because whether they
increase or decrease the rate of an instrumental response depends on whether
the rats are responding to avoid shock or to obtain food (e.g., Scobie, 1972).
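The logic behind Table 9.1 can be stated compactly: a CS should invigorate responding that is motivated by its own motivational system and, because fear and appetitive motivation are assumed to inhibit each other, suppress responding motivated by the opposing system; conditioned inhibitors should have the reverse effects. The short sketch below is only my own illustration of that rule (the function and argument names are not from Rescorla and Solomon), but it reproduces all eight cells of the table.

def predicted_change(instrumental_system, cs_system, cs_is_excitor):
    """Qualitative predictions of Table 9.1 (after Rescorla & Solomon, 1967).

    instrumental_system -- 'appetitive' (e.g., responding for food) or
                           'aversive' (avoidance responding motivated by fear)
    cs_system           -- the motivational system of the US the CS was trained with
    cs_is_excitor       -- True for a CS+, False for a conditioned inhibitor (CS-)
    """
    same_system = (instrumental_system == cs_system)
    # A same-system excitor or an opposite-system inhibitor should invigorate
    # responding; a same-system inhibitor or an opposite-system excitor should
    # suppress it (because fear and appetitive motivation inhibit each other).
    return "increase" if same_system == cs_is_excitor else "decrease"

predicted_change("appetitive", "aversive", True)   # a fear CS+ during food responding -> "decrease"
predicted_change("aversive", "aversive", False)    # a fear inhibitor during avoidance -> "decrease"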
Many experiments on Pavlovian-instrumental transfer have been run,
and there are enough data that conform to the major Rescorla and Solo-
mon predictions to believe that the framework is basically correct. There
are some complications worth noting, though. The effects of inhibitors are
not as well documented as the effects of excitors. More important is that in
addition to their motivational effects, Pavlovian CSs also evoke behaviors
that can influence instrumental behavior in ways that have nothing to do
with motivation. For example, auditory fear CSs evoke freezing in rats,
and at least part of the suppression of appetitive performance that occurs
when they are presented is undoubtedly due to this effect (e.g., Bouton &
Bolles, 1980). Defensive behaviors evoked by a fear CS can also sometimes
interfere with avoidance behavior (Hurwitz & Roberts, 1977). In appeti-
tive experiments, localized CSs can evoke sign-tracking behaviors that can
either facilitate or suppress instrumental responding depending on where
the CS is located with respect to the operant response lever (Karpicke,
Christoph, Peterson, & Hearst, 1977). Appetitive CSs can also elicit goal-
tracking behaviors (see Chapter 3) that can interfere with performing the
instrumental behavior (e.g., Holmes, Marchand, & Coutureau, 2010). The
general point is that Pavlovian signals evoke specific behavioral CRs as well
as motivational states, which can make proof of true motivational interac-
tions quite tricky. A number of elegant experiments by J. Bruce Overmier
and his colleagues at the University of Minnesota nonetheless confirmed
interactions like the ones predicted in Table 9.1 in a way that cannot be
explained by simple mechanical response interactions (see Overmier &
Lawry, 1979; Trapold & Overmier, 1972).
General and specific outcome expectancies
Rescorla and Solomon’s idea was that the motivational effects of the CS
invigorate or energize the instrumental response. At about the same time,
Konorski (1967) presented a compatible view that has been adopted and
extended by others. In Chapters 4 and 5, we saw that a CS may actually be
associated with nodes corresponding to both the sensory and emotional
aspects of the US (this idea is at the heart of the “affective extension” of SOP,
or AESOP; Wagner & Brandon, 1989). When the CS activates the sensory
node, it evokes a sensory response (Konorski called it a “consummatory”
response), and when the CS activates the emotive node, it evokes an emo-
tional response (Konorski called that a “preparatory” response). In this kind
of scheme, the preparatory or emotive CS activates an entire appetitive or
aversive motivational system. Because the CS activates an entire system,
a CS should invigorate instrumental responding that is reinforced by any
reinforcer within the system. For example, Balleine (1994; see also Corbit
& Balleine, 2005; Holland, 2004) showed that presenting a CS associated
with liquid sucrose will enhance an operant response reinforced with food
pellets in hungry rats. Evidently, the motivation provided by a Pavlovian
CS can be fairly general over different reinforcers in the appetitive system.
When a Pavlovian CS invigorates responding for other reinforcers in the
same motivational system, it is called general Pavlovian-instrumental
transfer, or general PIT.
One important insight about the effects of CSs on instrumental ac-
tion, however, is that their effects are sometimes rather specific. That is,
under some conditions, the CS mainly activates instrumental behaviors
that are associated with the same reinforcer (Colwill & Motzkin, 1994;
Colwill & Rescorla, 1988; Corbit & Balleine, 2005, 2011; Delamater, 1996;
Kruse, Overmier, Konz, & Rokke, 1983). For example, Corbit, Janak, and
Balleine (2007) studied PIT in hungry rats. In a Pavlovian training phase
(Figure 9.16A), they paired one CS with a food pellet, a second CS with
a few drops of liquid sucrose, and a third CS with a liquid starch solu-
tion (polycose). In a separate instrumental learning phase, the rats then
learned to press one lever for one of these outcomes (e.g., pellet) and a
second lever for a second outcome (e.g., sucrose). (The reinforcers were
different for different rats so that pellets, sucrose, and polycose were all
used equally.) When the three different CSs were then presented while the
rats were busy pressing the two levers while still hungry (Figure 9.16B), a very interesting pattern emerged.

Figure 9.16  Outcome-specific and general Pavlovian-instrumental transfer (PIT). In this experiment, all rats first learned to associate each of three CSs (A, B, and C) with one of three different appetitive outcomes (O1, O2, or O3). In an instrumental training phase, the rats then learned to press two different levers (R1 and R2), each for one of the outcomes (O1 and O2). Each CS was then tested while both levers were available, first while the rats were hungry and then while they were not. (B) In a test while hungry, rats showed outcome-specific PIT: CSs A and B invigorated the response that led to the same, but not the different, outcome. General PIT was also present: The CS associated with O3 (CS C) invigorated both responses. (C) In a test while the rats were not hungry, general transfer disappeared, but outcome-specific PIT remained. (After Corbit, Janak, & Balleine, 2007.)

When the CS signaled a reinforcer
that one of the responses earned, it caused the response that earned the
same reinforcer to increase above the baseline (“same outcome” in the
figure). Amazingly, the response that led to the different reinforcer did
not increase (“different outcome” in the figure). The specific nature of
this effect is called outcome-specific Pavlovian-instrumental transfer
or outcome-specific PIT. In this case, the CS specifically selects and
invigorates the response associated with the same reinforcer.
Interestingly, Corbit et al.’s (2007) experiment also demonstrated an
example of general PIT. That is, when the experimenters presented the
CS that had been associated with the third reinforcing outcome (the one
that neither response earned), this CS caused both responses to increase
(“General” in Figure 9.16B). Now here is a general motivating effect! It is
interesting that general PIT was restricted to the CS (CS C) that was asso-
ciated with O3. In contrast, CSs associated with O1 and O2 (A and B) did
not invigorate responses that produced O2 and O1—the different outcome
condition. The difference is probably that the rats had learned during the
instrumental training phase that R1 did not lead to O2 and R2 did not lead
to O1 (see Laurent & Balleine, 2015). In the absence of such knowledge, a
CS associated with a different outcome from the same motivational system
can excite instrumental responding—that is, it can cause general PIT.
Notice further, though, that the general PIT effect disappeared when the
rats were tested again when they were not hungry (i.e., after being given
free food for 24 hours; Figure 9.16C). Also notice, however, that the specific
PIT effect remained; even when the rat was not hungry, presenting a cue as-
sociated with a specific food still led the rodent to respond for it (for related
findings, see Rescorla, 1994; Holland, 2004). Humans show the same effect.
For example, when they learn one instrumental response to earn M&Ms and
another to earn popcorn, Pavlovian cues paired with M&Ms versus popcorn
invigorate responding for the same (but not the other) outcome, and this
tendency continues even after they eat M&Ms or popcorn until they are
tired of them (Watson, Wiers, Hommel, & de Wit, 2014). And in experiments
with smokers, pictures of cigarettes or chocolate bars specifically enhance
responses reinforced with them—even after smoking a cigarette or eating
nearly three chocolate bars (Hogarth & Chase, 2011; see also Hogarth, 2012).
Thus, outcome-specific PIT occurs whether or not the subject is deprived of
the reinforcer. Pavlovian cues can thus instigate us to work for reinforcers—
like sweets or cigarettes—even when we do not “need” them.
How does PIT actually work? As noted above, general PIT might occur
because the CS excites the emotional (or motivational) node of the rein-
forcer representation (e.g., Konorski, 1967; Wagner & Brandon, 1989), which
arouses the general motivational system. (The appetitive system would pre-
sumably be suppressed when the animal is less hungry.) Outcome-specific
PIT, in contrast, might work because a CS also activates a representation of
the sensory properties of the reinforcer (e.g., Konorski, 1967; Wagner & Bran-
don, 1989), and this representation might in turn activate a response that
has been specifically associated with it (Mackintosh & Dickinson, 1979; see
also Rescorla, 1994). Long ago, Trapold and Overmier (1972) alternatively
suggested that the CS might arouse an emotional state or expectancy that
might merely act as a discriminative stimulus (SD) and set the occasion for
the instrumental response. This kind of account can go some distance in
explaining the effects in Table 9.1 as well as outcome-specific PIT (see Figure
9.16) because specific reinforcer expectancies can set the occasion for specific
behaviors (e.g., Overmier, Bull, & Trapold, 1971; Peterson & Trapold, 1980;
Trapold, 1970). There are, however, results that seem inconsistent with
this view (e.g., Rescorla & Colwill, 1989), and it is not clear that it would
explain the finding that general PIT decreases when hunger is reduced (see
Figure 9.16; e.g., Corbit et al., 2007; Watson et al., 2014). Thus, the evidence
may favor the view that the CS can activate motivational (general PIT) as
well as sensory (outcome-specific PIT) aspects of the reinforcer. It is worth
noting, though, that PIT remains an active area of research.
What does it all mean?
Our discussion has gotten a little abstract and theoretical again, but the
issues do have practical implications. The main point of this section is that
detailed study of how rewards and punishers motivate instrumental be-
havior has led to a conclusion that you have been warned about all along:
Pavlovian learning is always occurring whenever instrumental or operant
conditioning is going on. Now we see that this second learning process can
modulate instrumental action. This point is important to remember when
you go about understanding behavior that you encounter, say, in the clinic.
For example, an obsessive-compulsive patient might be deathly afraid of
leaving his or her apartment without repeatedly checking to make sure that
all the electrical appliances are turned off. Checking behavior is an oper-
ant avoidance behavior that is thought to be reinforced by the reduction
of anxiety. According to what we have been discussing, checking behavior
will be exacerbated if CSs or cues in the environment make the individual
more afraid. Checking behavior would likewise be reduced by CSs or cues
in the environment that inhibit fear.
There are also clear implications for other instrumental behaviors, such as
eating food or taking drugs. General PIT suggests that we will be motivated to
seek food when we encounter cues associated with food—such as bright food
packaging, a place or context where we have eaten before, or perhaps a vivid
TV commercial. Outcome-specific PIT in turn suggests that cues for specific
foods (a particular candy bar or a particular pizza) will enhance our tendency
to seek (respond for) and choose that food, even when we do not really need
it. When you think about it, PIT is probably the mechanism that underlies the
effects of advertisements. A commercial associated with a specific brand of
beer or pizza is supposed to make you want it and reach for the refrigerator
or the phone to call for free delivery. PIT is thus a very basic behavioral pro-
cess that probably helped enable the multibillion-dollar advertising industry.

As we already know, addictive drugs are also appetitive reinforcers,
and drug taking is an instrumental act that drugs reinforce. The idea that
we have been discussing is that CSs or cues in the background that are as-
sociated with a drug may increase or decrease the motivation to perform
the behavior (Stewart, de Wit, & Eikelboom, 1984). Cues predicting an
addictive substance thus increase the motivation to work for it (e.g., Cun-
ningham, 1994; Krank, 1989; LeBlanc et al., 2012). In humans, Pavlovian
cues for drugs (or food) elicit the feeling of craving (Carter & Tiffany, 1999;
Wray, Gobleski, & Tiffany, 2012) or wanting (e.g., Berridge, 1996; Robinson
& Berridge, 2003). These words are also used informally to describe our
subjective feeling of the incentive motivation that is evoked by Pavlovian
cues. In fact, the behavioral effects of Pavlovian cues associated with ad-
dictive reinforcers play a major role in modern neurobiological theories
of addiction (e.g., Everitt & Robbins, 2005; Flagel, Akil, & Robinson, 2009;
Milton & Everitt, 2010, 2012; Robinson & Berridge, 2003; Saunders & Rob-
inson, 2013; Stewart et al., 1984). These theories, which combine brain and
learning theory, emphasize that Pavlovian CSs (1) modulate instrumental
behavior through PIT (our focus here), (2) become conditioned reinforcers
that can strengthen new behaviors (see Chapter 7), and (3) elicit attention
and approach to them (sign tracking). CSs are said to acquire “incentive
salience,” the power to control behaviors such as attractive approach and
wanting (e.g., Berridge, 1996).
In fact, Terry Robinson and his colleagues argued that some individuals
are especially likely to attribute incentive salience to Pavlovian cues and
that those that do may be especially vulnerable to addiction (e.g., Flagel
et al., 2009; Saunders & Robinson, 2013). For example, studies with rats
suggest that some rats are especially likely to approach and interact with
Pavlovian cues that are associated with food (i.e., they sign track). Other
rats tend to approach the food cup (goal track)—or do neither of these
behaviors consistently—when they are exposed to these cues. Remarkably,
individual rats that seem to attribute the most incentive salience to the
food cues—that is, the ones that sign track—may be especially affected by
cocaine cues and cocaine reinforcers when they later learn an instrumental
response to self-administer cocaine (see Saunders & Robinson, 2013, for a
review). According to Saunders and Robinson (2013), the extent to which
humans find drug cues attractive may similarly predict their subjective
craving, drug use, and readiness to relapse. Classical conditioning pro-
cesses are thus thought to play a very significant role in motivated behavior
like addiction.
Another broad implication of our discussion is one that should be famil-
iar by now: Classical conditioning does not merely involve the attachment
of a specific response to a CS. Instead, the CS acquires the power to engage
a whole set—or system—of responses (see Chapter 5), and we now see
that the CS has the power to influence instrumentally learned actions too.
A kind of summary of how CSs associated with appetitive and aversive outcomes can affect performance is presented in Figure 9.17.

Figure 9.17  One important effect of a CS is to excite or inhibit a motivational state, whether appetitive (A) or fear (B). That state can, in turn, evoke a constellation of behaviors connected with the state (for example, salivation and insulin secretion in the appetitive case; freezing and other defensive reactions, heart rate change, and analgesia in the fear case) and also—as emphasized in this chapter—modulate instrumental behavior.

In addition to evoking natural behaviors, the CS modulates learned instrumental actions.
It can do so by either energizing them (as in general PIT) or by selecting the
ones that lead to the outcome it predicts (outcome-specific PIT). The CS’s
modulation of instrumental actions can be seen as another way in which
stimulus learning helps the animal get ready to deal with motivationally
significant events (see Chapter 2).

Dynamic Effects of Motivating Stimuli


It is worth taking a closer look at how the emotional effects of rewards and
punishers can further unfold and change over time. This is what Richard
Solomon actually spent the last 20 years of his distinguished career think-
ing and writing about. His analysis revealed some very interesting general
features of the effects of rewards and punishers and how their effects might
change quite fundamentally with repeated exposure.
Opponent-process theory
Solomon’s opponent-process theory was originally worked out with
John Corbit (Solomon & Corbit, 1974; see also Solomon, 1980). It takes
its cue from “opponent processes” that you are already aware of in color
vision.

Figure 9.18  The standard pattern of affective dynamics. When an emotion-arousing event or stimulus happens (turns "on"), there is an initial emotional reaction that peaks and then adapts. Then, once the stimulus has ended (turns "off"), there is often an after-reaction that is opposite in valence to the primary reaction. (After Solomon & Corbit, 1974.)

Figure 9.18 is a loose rendering of what you might experience if you sat down in a dimly lit room and looked at a bright red stimulus for a minute or two. At the start of the stimulus, you would perceive a very
saturated red color, but as the stimulus continued, the intensity of the
perception might fade a bit. Then, if the stimulus were turned off and you
were shown a simple neutral white background, you might see a green
after-image. A similar peak, adaptation, and after-image would occur if
you were shown a green stimulus (this time, the after-image would be
red). With a yellow stimulus, the after-image would be blue, and with a
blue stimulus, the after-image would be yellow. You get the idea; color
perception is organized so that an opposite after-image occurs when the
stimulus is terminated after exposure.
Solomon and Corbit (1974) argued that the same pattern holds for emo-
tional stimuli. That is, exposure to any emotional stimulus creates an initial
emotional response followed by an adaptation phase and then an opposite
after-reaction when the stimulus terminates. Consider something fright-
ening like an encounter with a bear in the woods or a surprisingly difficult
final exam. You have an initial fear reaction, which may peak and then
adapt a bit as the encounter continues. Then, once the stimulus ends, you
have an opposite emotional reaction—you feel relieved and pretty darn
good. An analogous pattern occurs with positive events, like a euphoric
drug or an encounter with a boyfriend or girlfriend. Here, there is an initial
thrill or “rush” that peaks and then adapts a bit. When the encounter ter-
minates, you might also feel an opposite after-reaction—this time you may
feel a little bad. This standard pattern of affective dynamics rings true
of most emotional stimuli. The pattern fits some experiments in Solomon’s
laboratory on dogs’ reactions to shock, which showed the initial peak, ad-
aptation, and then after-reaction (Church, LoLordo, Overmier, Solomon, &
Turner, 1966; Katcher et al., 1969). It is also consistent with the emotional
reactions of skydivers on their initial jump (Epstein, 1967), who feel terror,
adaptation, and then relief when the whole thing is over.


Figure 9.19  The standard pattern of affective dynamics changes with repeated
exposure to the emotional stimulus. Top: Behaviorally, one observes habitua-
tion of the primary emotional reaction and intensification of the after-reaction.
Middle: At a theoretical level, the emotion is controlled by the difference be-
tween two opposing processes. The “a-process” is closely linked in time to the
emotional event. The opponent “b-process” is more sluggish than the a-pro-
cess, which accounts for adaptation and the after-reaction. Unlike the a-process,
the b-process also strengthens with repeated exposure. This causes habituation
and intensification of the after-reaction. (After Solomon & Corbit, 1974.)

The standard pattern changes, though, with repeated exposure to the
emotional stimulus, and this is possibly the theory’s most significant in-
sight. Figure 9.19 (upper left) repeats the standard pattern of affective
dynamics (see Figure 9.18), the pattern observed during the first exposures
to an emotional stimulus. With repeated stimulations, though, the pattern
changes to the one shown in the upper right of Figure 9.19. The primary
affective reaction habituates: There is less terror connected with jumping
out of an airplane or less of a rush connected with taking the drug. Equally
important, however, is that the opposite after-reaction has strengthened
and deepened. Jumping out of an airplane now causes a longer and more
positive relief state. Conversely, taking the positive drug now causes an
intense and prolonged withdrawal state.
In the long run, repeated exposure to rewards or punishers can thus
cause a profound change in their motivational consequences. A negative
stimulus, like jumping out of a perfectly good airplane, can actually become
positively reinforcing because the negative affect eventually habituates and
the after-reaction comes to feel so good. Similarly, a positive stimulus may
start out as a positive reinforcer, but after a great deal of exposure, we may
begin to seek it, not because it makes us feel good, but because it helps us
escape from an aversive after-state. The change in motivational pattern
is consistent with what opiate users report in the early and late stages of
their addiction (O’Brien, Ehrman, & Ternes, 1986). At first, abusers may
take the drug because it makes them feel good, but later the rush seems
less important. “If [later-stage addicts] are asked why they use heroin, they
often still say, ‘to get high.’ But if they are asked when was the last time they
actually became high after an injection, it may have been months or years
in the past” (O’Brien et al., 1986, p. 333). Repeated exposure to rewards and
punishers can thus change their motivational effects on behavior.
The theory offers an explanation of how this change comes about, and
this is depicted in the middle part of Figure 9.19. The stimulus actually
elicits two psychological processes, and behind the scenes, they combine to
create the emotional dynamics that we see in behavior. The first process is
the a-process, which has the affective quality and intensity of the stimulus
itself. It is stable, quick, and synchronized with the onset and offset of the
emotional stimulus. The other process, the b-process or opponent pro-
cess, is a homeostatic adjustment response. It has the opposite emotional
quality and thus subtracts from the a-process. It is a “slave process” in the
sense that it is aroused by the a-process. Most important, the b-process is
sluggish; it is slow to recruit and reach its maximum, and then it is also
slow to decay. The emotion that we feel (shown in the upper panels of Fig-
ure 9.19) is the mathematical difference between the two processes (a – b).
Thus, the adaptation phase occurs as the b-process comes on slowly. And
the after-reaction occurs because the a-process turns off quickly, leaving
the slower b-process to decay on its own for a while.
The change in dynamics that occurs with repeated exposure comes
about because the b-process strengthens with repeated use. As the middle right
panel of Figure 9.19 illustrates, after repeated use, the b-process comes on
more quickly, reaches a deeper maximum, and is ultimately also slower to
decay. This change yields the habituation and withdrawal effects described
above. The a-process remains unchanged; the only thing that has changed
is b. The theory assumes that this type of acquired motivation (which is
acquired through experience) is an automatic consequence of use and is
not at all learned.
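The arithmetic behind Figures 9.18 and 9.19 is simple enough to sketch in a few lines of code. The simulation below is only an illustration of the theory's logic, not anything specified by Solomon and Corbit: the a-process is assumed to follow the stimulus exactly, the b-process is modeled as a leaky integrator that is recruited and decays slowly, and repeated stimulation is assumed to make the b-process faster to recruit and slower to decay. All of the parameter values are arbitrary choices for illustration. Running it reproduces the qualitative pattern in Figure 9.19: a peak, adaptation, and a modest after-reaction on the first exposure, and a blunted primary reaction with a deeper, longer after-reaction after many exposures.

def simulate_affect(n_stimulations, on_steps=50, off_steps=150,
                    recruit=0.02, decay=0.05,
                    recruit_growth=0.02, decay_shrink=0.004):
    """Return the manifest emotion (a - b) over time for repeated stimulations."""
    emotion = []
    b = 0.0                                    # the sluggish opponent process
    for _ in range(n_stimulations):
        for t in range(on_steps + off_steps):
            a = 1.0 if t < on_steps else 0.0   # the a-process tracks the stimulus exactly
            if a > 0:
                b += recruit * (a - b)         # b is recruited slowly while the stimulus is on
            else:
                b -= decay * b                 # ...and decays slowly after it turns off
            emotion.append(a - b)              # the felt emotion is the difference a - b
        # "Repeated use" strengthens the b-process: it is recruited faster and decays
        # more slowly, which produces habituation of the primary reaction and a
        # deeper, more prolonged after-reaction.
        recruit = min(recruit + recruit_growth, 1.0)
        decay = max(decay - decay_shrink, 0.005)
    return emotion

first_exposure = simulate_affect(n_stimulations=1)   # peak, adaptation, modest after-reaction
after_many = simulate_affect(n_stimulations=10)      # blunted peak, deep and lingering after-reaction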
Emotions in social attachment
Some of the most intriguing tests of the opponent-process theory were
conducted by Howard Hoffman and his colleagues while they studied
the emotional dynamics of imprinting in ducklings. Filial imprinting is
the learning process by which young animals may become attached to
their mothers. Konrad Lorenz, the ethologist, wrote widely about the phe-
nomenon, and you may have seen photographs of him being followed by
a string of little goslings crossing a road. Within limits (Gottlieb, 1965),
young birds may become attached to many types of moving stimuli that
they experience at a young age. Although ethologists saw imprinting as a
unique learning process (e.g., Lorenz, 1937), it is probably another example
of the kind of associative learning that we study in Pavlovian learning.
Hoffman and Ratner (1973b) noted that very young ducklings (e.g., 17
hours old) are comforted by stimuli that move. Stimulus movement is
thus a reinforcer and/or a positive US for a young duckling. Hoffman and
Ratner exposed ducklings to either a moving electric train or a spinning
light from a police car. When the stimuli were presented in motion, they
suppressed distress calls that the ducklings made when they were first put
into the apparatus. In contrast, if the stimuli were presented in a stationary
manner, they had no such calming effect. Repeated exposure to a stimulus
in motion, however, allowed the stimulus to suppress distress calling even
when presented in a stationary manner (Hoffman, Eiserer, & Singer, 1972).
The birds associated the static features of the imprinting stimulus (CS) with
its movement (US). These stimuli consequently become motivationally
significant, calming the birds down; they may also have begun to elicit
sign tracking—as in the goslings that followed Lorenz around.
The emotional dynamics of imprinting fit opponent-process theory.
Hoffman and Ratner (1973b) measured 17-hour-old ducklings’ distress call-
ing before, during, and after the ducklings’ first 10-minute exposure to a
moving HO train. The results are shown in Figure 9.20.

Figure 9.20  Distress calling in ducklings before, during, and after exposure to a moving toy train. (A) When 17-hour-old ducklings are brought into a novel environment, there is considerable distress calling that is suppressed by presentation of the moving train (stimulus period). When the train is then removed from sight, the birds begin distress calling again. (B) When the duckling has spent most of its 17 hours of life in the test environment, the familiar environment evokes no distress calling, and presentation of the moving train has no obvious effect. When the train is then removed, however, distress calling begins. Thus, although the train is soothing to a distressed animal (A), its removal causes an aversive after-reaction that makes even a calm duckling distressed. (After Hoffman & Ratner, 1973b.)

The group shown at the top was put into the apparatus for the first time immediately before
the whole experience began. The ducklings were distressed at first, then
they calmed down when the moving train was presented, and then they
became distressed again when the train was turned off and withdrawn.
The interesting group, however, is in the lower panel of Figure 9.20. These
ducklings were actually hatched in the imprinting apparatus, where they
were left undisturbed until testing began 17 hours later. This group showed
no distress prior to presentation of the moving train—they were used to
the test environment. But after 10 minutes of exposure to the moving train,
turning it off caused major distress. Hoffman and Ratner had made calm
ducklings miserable by giving them the “positive” imprinting experience.
Here is real acquired motivation—the ducklings were doing just fine until
the imprinting stimulus was presented and then withdrawn. In terms of the
theory, the distress is the after-reaction caused by the b-process lingering
after removal of the positive train stimulus.
Once the birds have imprinted, brief exposures to the train stimulus
can reinforce operant responses, such as pecking at a pole. In other experi-
ments, Eiserer and Hoffman (1973) tested pole pecking after the birds were
“primed” by free exposures to the train. These primes stimulated pole
pecking after the stimulus, presumably by motivating the bird to escape
the aversive after-reaction discussed above. The amount of pecking also
increased as a function of the duration of the prime. In effect, the stronger
opponent process elicited by the longer primes (remember that the b-pro-
cess is slow to recruit) motivated instrumental behavior especially well.
Working in Solomon’s laboratory, Starr (1978) discovered something im-
portant about the growth of the b-process during imprinting. He took very
young ducklings and gave them 30-second exposures to a moving, stuffed
duck. Different groups of ducklings received their exposures separated by
1-minute, 2-minute, or 5-minute intervals. Distress calling increased after
each successive termination of the imprinting stimulus, but its growth
was most impressive the shorter the interstimulus interval (Figure 9.21);
at the longest interval, there was little growth. This kind of result suggests
that the growth of the opponent process does not merely depend on “re-
peated use” (Solomon & Corbit, 1974), but actually depends on repeated
exposure under massed conditions. Starr suggested that there is a “critical
decay duration”—that is, a crucial interval between successive stimulus
presentations beyond which the b-process will not grow. Seaman (1985)
reported compatible results for the acquisition of morphine tolerance. The
implication for addiction is that massed exposure to a biologically signifi-
cant stimulus, as might happen in a binge, may cause more immediate risk
for addiction than the same number of exposures that are spaced more
widely in time.
A further look at addiction
How does opponent-process theory hold up?

Figure 9.21  Distress calling in ducklings after 30-second exposures to a moving, stuffed duck. The growth of distress calling (the after-reaction) was most pronounced when exposures were massed in time (i.e., separated by 1 or 2 minutes). There was little growth when exposures were separated by 5 minutes. The bottom panel is a group that received one continuous 6-minute exposure. (After Starr, 1978.)

The theory provides a significant insight into an aspect of motivation that had been ignored before—that the motivating effects of rewards and punishers can change in a
big way with repeated exposure (e.g., see also Koob & Le Moal, 2008). A lot
of a good thing can make the good thing bad, and a lot of a bad thing can
make the bad thing better. Because of the growth of the opponent process,
an addiction syndrome can kick in; the organism may make the transition
from casual to compulsive-looking instrumental behavior (see Robinson &
Berridge, 2003). When opponent-process theory was first proposed in the
1970s, it added to a growing sense of excitement about the role of opponent
or “compensatory responses” in drug tolerance and drug dependence (see
Chapters 2 and 5). Today, however, there is less certainty that the growth
of the b-process is automatic and unlearned, as the original theory pro-
posed. Instead, most researchers are now convinced that the growth of the
b-process is at least partly governed by Pavlovian conditioning.
You might remember Siegel’s work (e.g., 1975) suggesting that repeated
pairings of a CS with a drug allow the CS to elicit responses that com-
pensate for unconditional effects of the drug (see Chapters 3 and 5). The
compensatory CR is essentially an opponent process—it subtracts from
the effect of a biologically significant outcome. Compensatory responses
are also linked theoretically with withdrawal responses (e.g., Siegel et al.,
2000), and they are not uncommon in conditioning (see Chapter 5). Notice
further that conditioned compensatory responses will grow with repeated
exposure to the US; in this case, though, their growth would be the result
of learning, and the full-blown opponent process would be seen specifi-
cally in the presence of the CS. There is now so much evidence suggest-
ing a role of conditioning in the development of drug tolerance (and so
comparatively little evidence that tolerance is merely automatic and not
learned) that the “associative” (learned) mechanism seems to have won the
day. The implication is that the opponent process will mainly be elicited
by CSs associated with the drug. CSs will motivate behavior, an idea that
we have clearly seen before, but the new twist is that CSs might motivate
in part because they trigger the opponent b-process.
The idea that CSs might elicit a b-process had an effect on theories
that followed the original opponent-process theory. Jonathan Schull (1979)
combined Solomon’s theory with a modern understanding of conditioning.
Schull argued that during conditioning, CSs come to elicit a b-process that
cancels the effect of the upcoming US. Schull also noted that blocking ef-
fects (L+, then LN+) can be explained if the CS doing the blocking cancels
the effectiveness of the US by eliciting the opponent process. The idea has
implications that are similar to the more conventional view that the CS re-
duces the US’s “surprisingness” (see Chapter 4), although it does not imply
a cognitive information processing system. Wagner’s SOP (“sometimes
opponent process”) theory has some of the same character. The amount
of conditioning that a US creates will depend on the extent to which its
elements go into the A1 state. A CS prevents that by putting the elements
into the A2 state instead. As we saw in Chapter 5, the response controlled
by A2 sometimes looks like an opponent process (Paletta & Wagner, 1986).
These models thus show how the growth of an opponent process might be
linked to classical conditioning.
One challenge for the idea that the growth of the opponent process
depends exclusively on conditioning is the finding that massed exposure
is especially effective at increasing withdrawal reactions or tolerance (Sea-
man, 1985; Starr, 1978; see Figure 9.21). Conditioning is usually worse with
massed trials, not better (see Chapter 3). So, how can we put all the facts
together? It seems possible that both learned and nonlearned mechanisms
influence habituation and withdrawal. For example, research on the ha-
bituation of startle reactions suggests that there is a short-term habituation
effect that develops best with massed trials and a longer-term habituation
effect that develops best with spaced trials (Davis, 1970). Such findings
are consistent with Wagner’s (1978, 1981) distinction between habitua-
tion resulting from self-generated and retrieval-generated priming (see
Chapter 4). Baker and Tiffany (1985) showed how this distinction can be
applied to drug tolerance (see also, e.g., Tiffany, Drobes, & Cepeda-Benito,
1992). Thus, there are reasonable grounds for thinking that both learned
and nonlearned factors might contribute to habituation and tolerance to a
biologically significant outcome.
So what does all this mean? The opponent process, or the strength of the
opposite after-reaction, may grow for two reasons. First, massed exposures
to a positive reinforcer may cause some short-term, nonassociative growth,
leaving the organism craving more after a series of quickly repeated stimu-
lations. A weekend binge can thus generate a lot of motivation for more of
that reinforcer, although this might fade reasonably quickly after the last
reinforcer exposure. Second, classical conditioning may cause a longer-term
growth of an opponent process that will be elicited by the CSs associated
with the reinforcer (“longer-term” because CSs can still elicit CRs even
after very long retention intervals). This kind of growth may depend on
more spaced exposures to the outcome and will follow the laws of condi-
tioning we discussed in earlier chapters. Opponent processes that grow
through either mechanism will presumably motivate instrumental actions
reinforced by the O. The conditioning mechanism, though, will allow an
enduring form of motivation that will always be potentially elicited by
cues that have been associated with O.
Conclusion
The opponent-process approach emphasizes withdrawal-like effects as the
motivational basis for addictive behavior. After repeated exposure to a
drug, the strong b-process causes a withdrawal state—an acquired need
for the drug—that is aversive and motivates the user to escape. One im-
plication is that the addict might not be motivated to take the drug until
he or she goes into the withdrawal state. There is a similarity here to the
depletion-repletion account of hunger, where motivation was likewise seen
as a response to need.
This point brings our chapter full circle. At the start, we learned two
important lessons about how need states like hunger and thirst actually
motivate. First, need states do not automatically energize instrumental ac-
tion; instead, the organism must first learn about the effect that an outcome
has on the need state (e.g., Balleine, 1992)—the process called “incentive
learning.” In a similar way, if a withdrawal state is going to motivate drug
seeking, the organism might first need to learn that the drug makes him or
her feel better in the withdrawal state (see Hutcheson, Everitt, Robbins, &
Dickinson, 2001). Second, other parts of the chapter raised doubts about
whether animals wait until they need food or water to engage in motivated
behavior: Much eating and drinking actually occurs in anticipation of—
rather than in response to—need. The same may be true of drug taking.
Drug abusers may normally crave or seek drugs before they physically
“need” them; incentive factors, rather than need factors, often predomi-
nate (Robinson & Berridge, 2003; Stewart et al., 1984). As noted earlier,
these incentive factors—like recent tastes of the drug or exposure to drug-
associated CSs—can motivate instrumental behavior by creating a kind of
“wanting” for the drug (Berridge & Robinson, 1995; Robinson & Berridge,
1993, 2003; Wyvell & Berridge, 2000). (“Wanting” the drug, however, can
be distinguished from actually “liking” the drug, which involves separate
incentive learning processes; Balleine, 1992; Dickinson & Balleine, 2002.)
There is plenty of food for thought, and room for new research, about is-
sues surrounding the motivation of instrumental action.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources and online quizzes.

Summary
1. Early thinking about motivation and behavior linked motivation to
biological need. Hull emphasized Drive, a form of motivation that was
caused by need. Drive was supposed to energize consummatory behav-
ior, random (“general”) activity, and instrumental action.
2. The Drive concept ran into trouble. Drive does not energize activity in
a random or general way; instead, hunger and thirst seem to select or
potentiate behavior systems that are designed (by evolution) to deal
with the motivational state. Drive does not blindly energize instrumental
action either. Motivational states influence instrumental behavior only if
the animal has had a chance to learn the reinforcer’s value in the pres-
ence of the motivational state. The latter process is called “incentive
learning.”
3. Eating and drinking seem to anticipate—rather than be a response to—
need. For example, animals drink or forage for food in ways that seem
to prevent them from becoming depleted. This process usually involves
learning, and what we eat and when we eat it (for example) are strongly
influenced by learning processes.
4. Instrumental behavior is motivated by the anticipation of reward.
Upshifts and downshifts in the size of reward cause positive and nega-
tive “contrast effects.” These involve emotion, and they suggest that
the motivating effects of a reward depend on what we have learned to
expect. The anticipation of reward causes “incentive motivation,” which
Hull added to his theory. The bottom line was that a classically condi-
tioned anticipatory goal response, rG, was thought to energize instru-
mental action.
5. There are other conditioned motivators besides rG. Avoidance learning
is motivated by fear, or rE. When rewards are smaller than expected,
frustration, rF, becomes important. Frustration is especially useful in ex-
plaining many “paradoxical reward effects” in which reinforcers are less
positive than our intuitions suggest they should be.
6. Extrinsic rewards (like prizes or money) can sometimes hurt human
performance that is said to be “intrinsically” motivated. The effect is
restricted to certain situations. Like other paradoxical reward effects, it
is consistent with the idea that the effects of reinforcers can depend on
expectations and psychological context.
7. The partial reinforcement extinction effect (PREE) is an especially impor-
tant paradoxical reward effect. Behaviors that are reinforced only some
of the time are more resistant to extinction than those that are always
reinforced. Behavior may be more persistent after partial reinforcement
because we have learned to respond in the presence of frustration. Al-
ternatively, partial reinforcement may make it more difficult to discrimi-
nate extinction from acquisition. The latter idea is refined in “sequential
theory.”
8. It was difficult to confirm a role for peripheral responses like rG and rE in
motivating instrumental behavior. Pavlovian-instrumental transfer (PIT)
experiments, however, demonstrate that presenting a Pavlovian CS
while an organism is performing an instrumental action can influence in-
strumental performance. The motivating effects of rewards and punish-
ers are thought to be mediated by classically conditioned expectancies
or motivational states, not by peripheral responses. In any instrumental
learning situation, cues in the background can become associated with
O and thereby motivate instrumental action.
9. PIT can take two forms. In general PIT, a CS that is associated with a
reinforcer can excite or invigorate an instrumental response that has
been reinforced by any outcome within the same motivational system
(e.g., different types of foods). In outcome-specific PIT, a CS excites or
invigorates an instrumental response that is specifically associated with
the same outcome. Pavlovian CSs can thus influence choice between
different instrumental behaviors, and outcome-specific PIT can insti-
gate organisms to make an instrumental response even when they are
satiated. These effects of Pavlovian cues are another reason that cues
associated with reinforcers can cause people to work for food or take
drugs, even when they do not “need” them.
10. The motivational effects of rewards and punishers can further change as
a function of experience with them. Exposure to an emotional stimulus
can cause an opposite after-reaction when the stimulus is withdrawn.
With repeated exposure to the emotional stimulus, the after-reaction
may also get stronger while the original emotional effect habituates.
According to opponent-process theory, this change occurs because an
opponent process elicited by the stimulus grows with repeated use.
Ultimately, the change can cause a reversal of the motivation behind
instrumental action. For example, although a positive stimulus is a
positive reinforcer at first, we may eventually seek it so as to escape the
strong aversive after-reaction. This may be a hallmark of addiction.
11. Opponent-process theory explains the emotional dynamics of imprint-
ing. The growth of opponent processes may depend more on learning
than the theory originally supposed, however. Conditioned compensa-
tory responses, which are essentially conditioned opponent processes,
may play a role in tolerance and habituation, although a growth of the
opponent process like the one envisioned by opponent-process theory
may still occur as a consequence of massed exposures to a significant
outcome.

Discussion Questions
1. Early motivation theorists thought that motivational states like hunger
and thirst provide energy that blindly energizes or invigorates behavior.
What is the evidence for and against this idea? Given what you now
know, how would you say that a motivational state like hunger affects
instrumental behaviors that lead to food?
2. According to the concept of incentive learning, organisms learn about
reinforcers and how they make them feel. Discuss the evidence for this
concept. How can you use what you know about incentive learning to
improve the clinical treatment of, say, a patient with an anxiety disorder
who might benefit from taking a new medication? How about a person
with mild depression who might benefit from becoming more active
socially or getting more exercise?
3. According to early research from Tolman’s laboratory—for example, the
latent learning experiment (Tolman & Honzik, 1930) and Tinklepaugh’s
bait-and-switch experiment with monkeys (Tinklepaugh, 1928)—reward-
ing outcomes motivate behavior and do not merely reinforce. Where
did this idea go from there? Use concepts like incentive motivation,
paradoxical reward effects, and Pavlovian-instrumental transfer to give a
modern view of how rewards motivate.
4. Use opponent-process theory along with the other various motivational
processes discussed in this chapter to provide a general account of
why people can become so strongly motivated to eat junk food or take
drugs like cocaine, opiates, and alcohol.
5. Early in this book, I warned you that even though Pavlovian condition-
ing and instrumental learning are usually discussed separately, they
often work together and interact. Given what you have learned in this
chapter, how would you say that Pavlovian conditioning processes influ-
ence instrumental learning and instrumental behavior?

Key Terms
a-process  399
acquired motivation  376
after-image  397
after-reaction  397
b-process  399
concurrent measurement studies  388
frustration  380
general Pavlovian-instrumental transfer (general PIT)  391
homeostasis  365
imprinting  399
incentive learning  368
incentive motivation  379
negative contrast effect  377
opponent process  399
opponent-process theory  396
outcome-specific Pavlovian-instrumental transfer (outcome-specific PIT)  393
paradoxical reward effects  381
partial reinforcement extinction effect (PREE)  384
Pavlovian-instrumental transfer  389
positive contrast effect  377
rG-sG mechanism  379
sequential theory  385
specific hungers  364
standard pattern of affective dynamics  397
Chapter Outline
Avoidance Learning  412
  The puzzle and solution: Two-factor theory  412
  Problems with two-factor theory  415
  Species-specific defense reactions  420
  Cognitive factors in avoidance learning  426
  Learned helplessness  431
  Summary: What does it all mean?  436
Parallels in Appetitive Learning  436
  The misbehavior of organisms  436
  Superstition revisited  437
  A general role for stimulus learning in response learning situations  440
  Punishment  442
  Summary: What does it all mean?  445
A Cognitive Analysis of Instrumental Action  445
  Knowledge of the R-O relation  446
  Knowledge of the S-O relation  452
  S-(R-O) learning (occasion setting)  454
  S-R and "habit" learning  456
Summary  461
Discussion Questions  463
Key Terms  464
Chapter 10
A Synthetic Perspective
on Instrumental Action

In this final chapter, we will conclude our discussion of instrumental
learning by describing what I consider the
contemporary, synthetic perspective on the problem.
This perspective began to develop in the late 1970s
and 1980s and continues to develop today, although it
seems somewhat underappreciated by many psychol-
ogists, who often assume that the study of operant
behavior ended with the behavior analytic approach
covered in Chapter 7. The synthetic approach draws
on material covered in Chapter 7, but it also draws
on Chapters 8 and 9 as well as material throughout
the rest of this book. It can be seen as a “cognitive”
approach to instrumental learning because so much of
what we now think is learned in even simple operant
situations is not directly manifest in behavior. It is also
biological because we will be reminded that evolution-
ary factors that shape behavior are not exactly irrel-
evant either.
My version of the story behind the contemporary
synthesis begins with a review of avoidance learning.
It seems obvious that organisms must be able to learn
to avoid dangerous, aversive, noxious events—espe-
cially events that can do them bodily harm. Although
avoidance learning was mentioned when we first
considered the law of effect many chapters ago (see
Chapter 2), I have resisted (“avoided”) talking about
it in any depth because it touches on many themes
that first needed to be developed in other chapters.
It is also especially good for helping integrate what
we have been talking about throughout this book. So, this chapter begins
with a discussion of avoidance learning. Its overall goal, though, is to sum-
marize and reassemble some of the themes that we have covered in differ-
ent parts of the book. The whole may be more than the sum of its parts.

Avoidance Learning
The puzzle and solution: Two-factor theory
Avoidance learning has always been a puzzle to learning theorists for one
simple reason. In other operant learning situations, it is easy to identify
an event that reinforces the behavior: For the rat pressing a lever in the
Skinner box, it is the pellets that are delivered as a consequence of the
lever-press response. The reinforcer in avoidance learning is much less
obvious, however. For example, one way that avoidance learning is stud-
ied in animals is with the use of an apparatus called a shuttle box (Figure
10.1). In this situation, a rat is trained to avoid a brief electric shock by
running from one side of a compartment to the other, on each trial. Once
the animal has learned to shuttle in this manner, it will make the response
repeatedly without getting shocked again. So, what keeps the shuttle
behavior going? Saying something like “the fact that the shock does not
happen” is not very helpful because a very large number of things do
not happen whenever the rat shuttles. The animal is not hit by a black
SUV or a laser beam blast from an alien spacecraft either. The absence of
these events cannot explain the behavior. How can the nonoccurrence of
an event be a reinforcer?

Figure 10.1  The shuttle box, an apparatus used in many studies of avoidance learning in rats. (After Flaherty, 1985.)

A theoretical answer to this question was provided by O. H. Mowrer many
years ago (Mowrer, 1939, 1942). Mowrer’s two-factor theory is still very in-
fluential today, and it combines many of the pieces of instrumental behavior
that we have been discussing in the last few chapters. Put casually, Mowrer
noted that in any avoidance situation there are usually cues or warning sig-
nals in the environment that tell the organism that an aversive event is about
to happen. In the shuttle box experiment, each trial typically starts with the
presentation of a stimulus like a buzzer. If the rat does not shuttle soon after
O. H. Mowrer
the buzzer comes on, it will be shocked. But if it shuttles during the buzzer,
shock is avoided, and the buzzer is usually turned off until the next trial.
There are really two kinds of trials that occur in avoidance learning
experiments (Figure 10.2). In one type of trial—which is typical early
in training (before the animal has learned the avoidance response)—the
warning signal comes on and is paired with shock. Mowrer (1939, 1942)
noted that these trials are classical fear conditioning trials, and the result of
several buzzer-shock pairings is that the buzzer becomes a CS that elicits
fear. This is the first factor in two-factor theory: Pavlovian fear conditioning
of warning stimuli. This process then enables something else to occur. If the
rat now happens to make the shuttling response during the presentation
of the buzzer, the buzzer is turned off. Because the response terminates the
fear CS, it is associated with fear reduction. This is the second factor in the
theory: reinforcement of the instrumental response through fear reduction.
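To make the two factors and their interaction concrete, here is a toy trial-by-trial simulation. It is only a sketch under simple assumptions of my own (the learning rules, the parameter values, and the small baseline probability of shuttling are illustrative and are not Mowrer's): fear of the warning signal grows on trials in which the signal is paired with shock, and the avoidance response is strengthened in proportion to the amount of fear it terminates.

import random

def simulate_two_factor(n_trials=60, alpha=0.4, beta=0.5, base_rate=0.05):
    """Toy trial-by-trial sketch of two-factor avoidance learning (illustrative only)."""
    fear = 0.0    # Factor 1: Pavlovian fear conditioned to the warning signal
    avoid = 0.0   # Factor 2: strength of the instrumental shuttle response
    record = []
    for _ in range(n_trials):
        # The animal occasionally shuttles even before learning (base_rate).
        responded = random.random() < base_rate + (1 - base_rate) * avoid
        if responded:
            # The response turns the warning signal off; reinforcement through
            # fear reduction is assumed to be proportional to the fear removed.
            avoid += beta * fear * (1 - avoid)
            fear *= 0.95    # the CS occurred without shock: a little Pavlovian extinction
        else:
            # No response: the warning signal is paired with shock, and
            # Pavlovian fear conditioning strengthens (Factor 1).
            fear += alpha * (1 - fear)
        record.append((responded, round(fear, 2), round(avoid, 2)))
    return record

Running it shows the qualitative sequence the theory describes: fear of the warning signal rises over the early trials, avoidance responding then emerges and is reinforced, and shocks become rare even though the shock itself no longer occurs on most trials.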


Figure 10.2  The two factors in Mowrer’s two-factor theory of avoidance learn-
ing. On early avoidance learning trials, before the organism knows how to avoid,
there are trials in which stimuli in the environment are paired with the to-be-
avoided event (an electric shock). Because of Pavlovian fear conditioning, those
stimuli become warning signals and arouse fear. (The animal is usually able to
terminate the shock by making a response, as shown.) On later trials, when the
organism makes the response that avoids the shock, the response also termi-
nates the warning signal, which allows reinforcement through fear reduction.

Notice that although the shuttle box experiment is artificial, Mowrer’s idea
has broad implications. We perform avoidance behaviors because they re-
duce or escape anxiety and fear. And, to step back even further, Mowrer’s
resolution of the avoidance learning puzzle is that Pavlovian conditioning
and instrumental learning are happening and interacting at the same time.
The Pavlovian fear process is essential here because it allows the reinforce-
ment of avoidance learning.
Mowrer’s ideas were supported by several early experiments. In one
experiment, Mowrer and Lamoureaux (1942) had two groups of rats in
a shuttle box learn to avoid whenever a buzzer came on. For one of the
groups, making the avoidance response turned the buzzer off—the avoid-
ance response was therefore associated with immediate fear reduction. For
the other group, the shuttling response avoided the shock, but the buzzer
remained on for another 10 seconds after the response. Here, fear reduc-
tion eventually occurred, but it was considerably delayed, and there was
theoretically less reinforcement. Consistent with the theory, rats in the first
group (with the immediate buzz-off contingency) learned better than the
second group. It appeared that turning the buzzer off immediately was
necessary for good avoidance learning, presumably because it provided
the reinforcement.
The idea was also supported by another experiment by Neal Miller
(1948; see also Brown & Jacobs, 1949). Miller’s is one of those rare experi-
ments that is so famous, it has its own name: the “acquired drive experi-
ment.” It separated the Pavlovian and instrumental processes into two
phases. Rats were first put into the white side of a box (that had white
and black sides), where they received an electric shock a few times. These
shocks arguably conditioned fear to the white part of the box. Miller then
allowed the rats to escape from the white compartment to the black com-
partment by performing a new instrumental response. Initially, he merely
opened a door in the wall separating the two compartments and allowed
the rats to run from the white compartment to the black compartment,
which the rats learned to do readily. Then, he required them to turn a small
wheel above the door to open the door between the compartments—most
of the rats learned this response, too. Finally, the rats learned to press a
lever to open the door instead of turning the wheel. At this point, turning
the wheel extinguished and was replaced by lever pressing. All these in-
strumental responses were learned in the absence of any additional shocks.
Apparently, simply escaping from the white cues was sufficient to reinforce
new instrumental behaviors. Conditioned fear thus served as an acquired
drive—that is, once acquired through conditioning, it reinforced new be-
havior through a kind of drive reduction (see also McAllister & McAllister,
1971, 1991).
The idea that people also learn behaviors that escape or reduce nega-
tive emotion is a pervasive one in clinical psychology. For example, many
accounts of drug abuse propose that drug taking is reinforced by escape
from the negative affect that is generated by the drug (e.g., Baker, Piper,
McCarthy, Majeskie, & Fiore, 2004). The connection is especially clear with
anxiety disorders. For example, humans who acquire panic disorder with
agoraphobia stay at home and avoid going to places (like shopping malls)
that might evoke fear or anxiety. Presumably, one reason they do so is that
they have learned to escape learned anxiety or fear. Similarly, you might
have encountered someone with an obsessive-compulsive disorder who
(for example) obsessively washes his or her hands in the bathroom of the
dorm. One account of such behavior (e.g., Mineka & Zinbarg, 2006) is that
the person is washing his or her hands to reduce a learned fear of contami-
nation. Finally, you might know someone with bulimia nervosa, the eating
disorder in which a person eats excessively and then vomits afterward. One
explanation (e.g., Rosen & Leitenberg, 1982) is that food and eating have
become associated with fear about getting fat. Vomiting is then reinforced
by anxiety reduction—one eats and then purges the food to feel better. One
implication is that these various disorders can be helped by extinguishing
the anxiety that provides their motivational basis. For example, Leitenberg,
Gross, Peterson, and Rosen (1984) had people with bulimia eat and then
forced the extinction of anxiety by preventing the vomiting response (this
technique is called response prevention). This method reduced the anxiety
as well as the bulimic symptoms.
Problems with two-factor theory
By the early 1950s, things were looking good for Mowrer’s theory, and
several later writers continued to refine it (e.g., Levis & Brewer, 2001;
McAllister & McAllister, 1991). The theory encountered some challenges
along the way, however. The theory has two main predictions, both of
which have stimulated research. One prediction is that if fear motivates
avoidance behavior, the strength of the avoidance response should cor-
relate with the amount of fear that the subject shows in the situation.
Another prediction is the one pursued by the Mowrer and Lamoureaux
(1942) and Miller (1948) experiments: Terminating the warning signal
(and escaping fear) should play a major role in allowing avoidance
learning.
We considered the first prediction—that fear correlates with the strength
of the avoidance response—in Chapter 9. You will remember that many ex-
periments failed to discover the kind of correlation that Mowrer predicted
between overt signs of fear and avoidance responding. Dogs avoiding in
the shuttle box do not always look afraid, and their heart rates—a possible
overt sign of fear—do not correlate very well with avoidance either. Other
experiments removed the warning signal from the experimental chamber
and tested it elsewhere for how much fear it evoked. For example, Kamin,
Brimer, and Black (1963) tested the CS in a conditioned suppression situ-
ation (in which it was probed on a lever-pressing baseline reinforced by
food) and found that the CS evoked relatively little fear after the rats had

Figure 10.3  Fear of the warning signal declines after extensive avoidance training. In this experiment, rats received avoidance training in a shuttle box until they made 1, 3, 9, or 27 avoidance responses in a row (the avoidance "criterion"). Then the warning signal was tested for its ability to evoke conditioned suppression: It was presented while the rats pressed a lever in a separate Skinner box. The adjusted suppression ratio (plotted against the avoidance criterion) is the difference between suppression during the test and during a test that preceded avoidance training; fear is represented by a lower score. (After Kamin et al., 1963.)

received extensive avoidance training (Figure 10.3; see also Cook, Mineka,
& Trumble, 1987; Mineka & Gino, 1980; Starr & Mineka, 1977). Thus, there
seemed to be relatively little fear of the CS when the avoidance response
was very strong. These problems are not necessarily fatal to two-factor
theory, though. For example, after extensive training, a large amount of
fear reduction (and thus a large amount of fear) might not be necessary
to maintain the behavior (the situation might be different when you are
first learning the response). In addition, later in this chapter we will see
that, after extensive training, animals may perform instrumental behaviors
simply out of habit. (In the presence of the shuttle box or the buzzer, they
may reflexively perform the response.) Fear and fear reduction might not
be as important once the response is a habit. Furthermore, in Chapter 9,
we discussed the current view of Pavlovian-induced motivation, which is
that CSs excite central states rather than peripheral responses (e.g., Rescorla
& Solomon, 1967). By and large, the idea that fear motivates avoidance
behavior is consistent with Pavlovian-instrumental transfer experiments
indicating that the presentation of Pavlovian fear excitors and inhibitors
can increase or decrease avoidance rate. Thus, the idea that fear motivates
avoidance behavior is on reasonably firm ground.
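A note on the dependent measure in Figure 10.3 may be useful. Assuming the conventional suppression ratio (lever presses during the CS divided by the sum of presses during the CS and during an equal period just before it), the adjusted score described in the figure caption can be written out explicitly; the symbols below are chosen here for convenience and are not from the original report.

\[
\mathrm{SR} = \frac{B}{A + B}
\qquad\qquad
\mathrm{SR}_{\text{adjusted}} = \mathrm{SR}_{\text{test}} - \mathrm{SR}_{\text{pre}}
\]

Here B is the number of lever presses during the warning signal, A is the number during the equally long pre-CS period, and the adjusted score is simply the test ratio minus the ratio from the pre-training test. A ratio of .50 indicates no suppression (no fear) and 0 indicates complete suppression, so more negative adjusted scores in Figure 10.3 indicate more fear of the warning signal.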
On the other hand, there has also been controversy about the role of
warning signal termination, the crucial event that is supposed to provide
reinforcement through fear reduction. First consider an interesting series
of experiments that came out of the behavior analytic (i.e., Skinnerian)
tradition. Initially, many radical empiricists had trouble accepting the
concept of “fear” but essentially adopted a two-factor theory in which
“aversiveness” instead of fear was conditioned to the warning signal and
provided reinforcement when the aversiveness was reduced (Schoenfeld,
1950). Murray Sidman (1953) invented an avoidance learning procedure
that had a huge influence. This procedure, called “Sidman avoidance” or


Figure 10.4  A Sidman, or free-operant, avoidance procedure. Brief electric shocks occur at regular intervals (the S-S interval) unless a response is made—which initiates a longer interval (the R-S interval). Notice that each response resets the R-S interval. Animals can make the response whenever they "want" to and learn to avoid shock even though there is no explicit warning signal.

“free-operant avoidance,” was important because it eliminated the warning
signal. Sidman set things up so that the rat received a brief electric shock
at regular intervals unless it made a lever-press response. The procedure
is illustrated in Figure 10.4. If the rat did not press the lever, it received
shocks at a standard interval (the “shock-shock” interval) of, say, one shock
every 5 seconds. However, if the rat happened to press the lever, the re-
sponse initiated a “response-shock” interval that could be, for example, 20
seconds. If the response-shock interval was longer than the shock-shock
interval so that the response produced a period of time that was free of
shock, the rat would learn to press the lever. The response is a free operant
because, like the typical lever press for food, the rat can make the response
whenever—and as often as—it wants to. The fact that a warning signal is
not necessary to make avoidance learning happen raised doubts about
two-factor theory because without a warning signal there would seem to
be little opportunity for reinforcement by fear reduction.
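To make the schedule's timing logic concrete, here is a minimal sketch, written in Python, of a free-operant avoidance procedure with the example values from the text (a 5-second shock-shock interval and a 20-second response-shock interval). The function and parameter names are purely illustrative and do not come from the original experiments.

# Minimal, illustrative sketch of a Sidman (free-operant) avoidance schedule.
# Shocks occur every ss_interval seconds unless a response occurs; each
# response postpones the next shock by rs_interval seconds ("resets the clock").

def sidman_shock_times(response_times, session_length=60.0,
                       ss_interval=5.0, rs_interval=20.0):
    """Return the times (in seconds) at which shocks would be delivered."""
    shocks = []
    responses = sorted(response_times)
    next_shock = ss_interval      # the shock-shock clock starts running at time 0
    i = 0
    t = 0.0
    while t < session_length:
        if i < len(responses) and responses[i] <= next_shock:
            # A response occurs before the scheduled shock: start the R-S interval.
            t = responses[i]
            next_shock = t + rs_interval
            i += 1
        else:
            # No response in time: deliver the shock and restart the S-S interval.
            t = next_shock
            if t >= session_length:
                break
            shocks.append(t)
            next_shock = t + ss_interval
    return shocks

# A rat that never responds is shocked every 5 seconds; a rat that responds
# at least once every 20 seconds is never shocked at all.
print(sidman_shock_times([]))                        # [5.0, 10.0, ..., 55.0]
print(sidman_shock_times([4.0, 19.0, 34.0, 49.0]))   # []

The sketch makes the procedure's key property easy to see: the only thing that predicts the next shock is the time that has elapsed since the last response, a point that becomes important in the next paragraph.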
Unfortunately, although Sidman’s procedure eliminated all explicit sig-
nals for shock, it is not difficult to imagine the animal using time as a CS
for shock. Anger (1963) pointed out that the animal needs only to sense
the time that has passed since it made its last response. Timing processes
were discussed in Chapter 8. In modern terms, we might suppose that the
response starts a clock and that the next shock becomes more and more
likely as the clock ticks along. The passage of time would thus elicit in-
creasing fear, and if the next response were to reset the clock, the response
would be associated with fear reduction. Thus, a simple two-factor account
of the Sidman avoidance procedure is possible.
At this point, Richard Herrnstein (1969) stepped in. Herrnstein suggested
that it is unnecessary to infer unobservable entities like temporal cues or
even aversiveness or fear. Instead, one can understand avoidance learning
by merely recognizing that the response simply reduces the overall rate of
shock. Consistent with this idea, Herrnstein and Hineline (1966) found that
rats would press a lever if the response caused shocks to occur at a lower
average rate. In their procedure, responding did not prevent the occurrence
of shocks; rather, it caused them to occur at quasi-random times at a lower
frequency than they did when the rats did not respond. A reduction in shock
rate is thus all that is necessary to maintain avoidance behavior; according
to Herrnstein (1969), speculations about fear and Pavlovian learning were
superfluous. This idea is still around (e.g., Baum, 2001; Hineline, 2001), al-
though it has been challenged in many places (e.g., Ayres, 1998; Dinsmoor,
2001). One problem is that it just does not explain many of the results in the
literature on avoidance learning, such as the acquired drive experiment—or,
in fact, any of the phenomena to be mentioned in the next several pages. At
least as important, it is not necessarily clear that claiming that an avoidance
response is learned because it reduces the shock rate does much more than
restate the original observation (that animals learn to avoid shock). Although
Herrnstein was being true to his radical behaviorist roots, many investiga-
tors would like an understanding of the psychological processes behind
the avoidance learning phenomenon. And even in the Herrnstein-Hineline
(1966) experiment, although the rats learned to perform responses that were
occasionally followed soon by a shock, on average those shocks still occurred
longer after the response than they did after any other event in the situation
(e.g., Ayres, 1998). If a CS can signal a reduction in the average frequency of
shock (e.g., as in negative contingency learning; see Chapter 3), why can’t an
avoidance response do the same thing? There are thus grounds for thinking
that the rats were still responding to reduce the level of fear.
Other research, however, began to show that the importance of warning
signal termination was not universal. Several investigators began manipulating the contingencies in avoidance experiments (e.g., Kamin, 1957). One especially interesting set of experiments was
conducted in the laboratory of Robert C. Bolles (Bolles, Stokes, & Younger,
1966). The main results are summarized in Figure 10.5. In one experiment,
groups of rats were run in a shuttle box. For some of these groups, perform-
ing the response turned off the warning signal, but for the other groups,
it did not. Independently, for some of the groups, performing the response avoided the shock, but for the others, it did not. (These two variables
were combined factorially, as shown in Figure 10.5.) The general pattern
seems clear. Turning off the warning signal made a difference; overall,
rats that terminated it made a response about 50% of the time, whereas
those that did not made a response about 25% of the time. Just being able
to avoid on its own also made a difference (overall, 55% to about 20%).
The important thing, though, is that a different pattern of results emerged
when rats were tested in a running wheel instead of the shuttle box. In this
condition, the rat was inside the wheel and had to run in it to make the
wheel spin a quarter turn to register a response. As Figure 10.5 shows, in
this case being able to turn the warning signal off again made a difference,
albeit a fairly small one. On the other hand, simply being able to avoid the
shock—regardless of whether the warning signal went off—made a huge

Shuttle box
                      Terminate warning signal?
                         Yes      No
    Avoid?    Yes         70      40
              No          31       9

Running wheel
                      Terminate warning signal?
                         Yes      No
    Avoid?    Yes         85      75
              No          38      26

Figure 10.5  The effect of turning off the warning signal depends on the avoid-
ance behavior being trained. Numbers in the cells indicate the percentage of
trials on which rats responded during a warning signal before an electric shock
occurred. Rats were trained in either a shuttle box or a running wheel. When
the response occurred, it terminated (turned off) the warning signal for some
rats but not for others. It also prevented (avoided) presentation of the shock for
some rats but not others. Two-factor theory did not predict that warning signal
termination would depend so much on the response. (Data from Bolles et al.,
1966; after Bolles, 1972a.)

difference. The results suggested that warning signal termination might
not have general reinforcing properties. Its role depends on what response
the animal is asked to make. Of course, the comparison of running in the
wheel and shuttling in the shuttle box confounded the two responses with
the actual apparatus that they were tested in, so the specific role of the
response was not completely clear.
Bolles (1969) therefore pursued the role of the response further in an-
other experiment. Here, all rats were tested in the same running wheel. To
avoid an electric shock (and terminate the warning signal), the rats needed
to run, turn around, or stand on their hind paws when the warning signal
sounded. As Figure 10.6 shows, although the probability of each of these
responses was nearly equal at the start of training, running was the easiest
behavior to learn. In contrast, standing never increased in strength; thus,
the success of avoidance learning depended on the response requirement.
Bolles (1972a) noted that this conclusion is actually consistent with the en-
tire literature on avoidance learning, which suggests that some avoidance
responses are almost trivially easy to learn, whereas others are exceedingly
difficult. For example, if the rat is required to jump out of a box where it is
shocked so as to avoid, learning happens in approximately one trial (the
rat needs only one or two shocks to learn to jump out of the box on the
next trial). One-way avoidance, in which the rat is always asked to run
from one side of a two-compartment box to the other side—in the same
direction—is also learned in one or two trials. The famous shuttle box is
considerably more difficult. Here learning does not get much better than

Figure 10.6  Avoidance behaviors are not equally learnable—another example of preparedness. Different rats were allowed to run, turn, or stand to avoid receiving an electric shock in the running wheel. Running is far easier to learn than turning or standing in the wheel. (The figure plots percentage avoidance across 10-trial blocks for each response.) (After Bolles, 1969.)

about 70% avoidance after 100 or so trials. However, the task that is truly
difficult is lever pressing. According to Bolles, after 100 trials, the average
rat only responds 10% to 30% of the time. (Bolles once suggested that the
best way to ensure one-trial avoidance learning in a Skinner box was to
leave the lid off the box.) It does seem odd that lever pressing—which is so
easy to learn when it is reinforced by food—seems so difficult to learn as an
avoidance response. On the other hand, we have seen this theme before. As
demonstrated by flavor aversion learning (e.g., Chapters 2 and 6), animals
seem “prepared” to learn some things and not others. The differences in
how readily some avoidance responses are learned over others may be an-
other example of preparedness (i.e., a biological “constraint” on learning).
Species-specific defense reactions
In a classic paper, Bolles (1970; see also Bolles, 1972a) put this argument in
its evolutionary context. He suggested that animals have evolved certain in-
nate behaviors that protect them from being preyed upon by other animals.
He called these behaviors species-specific defense reactions (SSDRs). As
noted in Chapter 2, it now seems clear that some behaviors must be innate if
the animal is to defend itself in the wild. For example, when a rat is attacked
by a predator like an owl, there is usually no second trial. If the rat does not
get away the first time, the owl irrevocably wins the contest. Thus, if the rat
is lucky enough to detect the owl before it strikes, it pays for the rat to have
a prepackaged defensive response ready so as to avoid being attacked. In
addition to making this point, Bolles proposed that an avoidance response is
easy to learn in the lab provided that it is similar to a natural SSDR. He also
proposed that two important SSDRs are flight (getting out of a dangerous
situation) and freezing (a common behavior in which an animal stops dead
in its tracks). As mentioned in Chapter 2, and as we will see in a moment,
freezing does reduce predation (Hirsch & Bolles, 1980).

Bolles’ idea was consistent with the general picture on avoidance learn-
ing mentioned above. Jump-out and one-way avoidance are readily learned
because they are effective flight responses. Lever pressing is not; it is not
compatible with an SSDR, and it is therefore difficult to learn. Shuttling in
a shuttle box is somewhere in the middle. It is a kind of flight response, but
it is not truly flight because on the second trial the rat must return to the
other side of the box, where it had received shock during the previous trial.
Therefore, shuttling is not a true SSDR, and it requires a lot of feedback—or
reinforcement—to learn. Consistent with this idea, warning signal termina-
tion is much more important in the shuttle box than in the one-way situa-
tion or the running wheel (Bolles & Grossen, 1969). SSDR theory initially
had a rather ad hoc flavor to it, but additional research helped tighten it up.
For example, Grossen and Kelley (1972) soon discovered another SSDR in
the rat and confirmed the prediction that it was easy to learn as an avoid-
ance response. Specifically, they found that if a rat receives shock while it
is in a large enclosure (about 1 square meter), it will immediately run to
the wall. This wall-seeking behavior is called thigmotaxis, and it is quite
common in rodents; if you ever notice a mouse in your home or in a camp
shelter or a barn, you will usually find it near a wall, where it is presum-
ably protected on the wall side from attack. Grossen and Kelley therefore reasoned that thigmotaxis would be learned very quickly as an avoidance response. If rats could avoid shock by jumping
onto a ledge around the perimeter of their enclosure, they learned to do so
very rapidly. In contrast, if a platform of the same area was positioned in
the center of the arena, the rats never learned to jump to it. Finally, if given
a choice between a center platform and a ledge around the perimeter, they
always chose the ledge. Thigmotaxis-like responding was indeed learned
rapidly, confirming a prediction of SSDR theory.
The theory was also advanced by additional research on freezing, one of
the rat’s most common SSDRs (e.g., Fanselow & Lester, 1988). One thing we
now know about freezing behavior is that it is truly innate and functional.
This was beautifully shown in an experiment by Hirsch and Bolles (1980),
who went about the state of Washington trapping deer mice and then breed-
ing them in the lab. (Bolles was based at the University of Washington in
Seattle.) The geography of Washington (Figure 10.7) is interesting because
the state is bisected by the Cascade Mountains, which run north-south; these
mountains create two distinct climates—one to the east and one to the west.
Prevailing winds from the Pacific Ocean bring moisture to the western side
of the mountains, but there is little moisture left for the eastern side. Be-
cause of that pattern, Hirsch and Bolles realized that subspecies that have
adapted to western and eastern Washington are preyed upon by different
species of predators. In the arid east, a natural predator of the mouse is the
gopher snake. In the lush west, there are no gopher snakes, but there is
another natural predator—the weasel. Hirsch and Bolles therefore trapped
mice in eastern and western Washington and then allowed them to breed,
in separate lines, for several generations in the lab. The subjects in their ex-

(Map legend: annual inches of rain, in bands ranging from less than 10 to more than 180.)

Figure 10.7  West of the Cascade Mountains, the state of Washington gets a
great deal of rain each year, whereas east of the mountains, the state gets very
little. Two subspecies of deer mice have adapted to these conditions (Pero-
myscus maniculatus austerus and P. m. gambeli, respectively). P. m. austerus
is preyed upon by weasels, whereas P. m. gambeli is preyed upon by gopher
snakes. Hirsch and Bolles (1980) found that lab-reared mice descended from
these lines naturally recognized their respective predators. (Map courtesy of
Dan Clayton.)

periment were two generations removed from the live-trapped mice—they
had never been outside the lab, and had never been exposed to a natural
predator. In the experiment proper, mice from each line were put into a large,
naturalistic arena with animals of different types. When a gopher snake
was in the arena, the eastern Washington mice froze substantially, but the
western Washington mice did not. In fact, eastern Washington mice died
less frequently than western mice. In contrast, when a weasel was in the
arena, the western Washington mice froze, but the eastern Washington mice
did not. Here, the western mice died less frequently than the eastern mice.
Mice did not freeze at all to the presence of a nonpredatory garter snake or
ground squirrel. The results made two important points. First, freezing has
a clear payoff. As SSDR theory assumes, it reduces predation. Second, there
appears to be an innate recognition system that allows mice from different
regions to recognize their natural predators.
Another important advance was made in an experiment in which Bolles
and Riley (1973) simply required rats to freeze to avoid shock. The obvious
prediction was that freezing would be learned very easily as an avoidance
response. Bolles and Riley used a Sidman avoidance procedure like the one
described above, in which rats were placed inside a small box and given a
brief electric shock at 5-second intervals (the “shock-shock interval”). If the
rat froze, the next shock did not occur for 15 seconds (the “response-shock
interval”). Once a rat starts freezing, it can freeze continuously for several
minutes. In the Bolles and Riley experiment, the next shock was continu-

Figure 10.8  Freezing is rapidly learned as an avoidance response, but it turns out to be a respondent rather than an operant. (A) Rats quickly learned to freeze to avoid receiving shock (percentage freezing across 5-minute intervals). When shocks were then used to punish freezing, freezing decreased immediately, but never declined further. (B) Regardless of the group, freezing followed the same time course after each shock (percentage freezing as a function of time since shock). A shock elicits an initial burst of activity (making freezing zero), but the rat soon begins to freeze. If left undisturbed, the rat will freeze for 15 minutes or so and then eventually stop, which explains why rats learned to freeze so quickly after they were first shocked. However, when shock was used to punish freezing, each shock first elicited the activity burst (making freezing go to zero) and then freezing again. Once freezing resumed, another shock was delivered, and the same activity-then-freeze cycle began all over again, causing the rat to freeze about 60% of the time. (After Bolles, 1975; Bolles & Riley, 1973.)

ously postponed until freezing stopped. The results are shown in Figure
10.8A. Consistent with the theory, the rats learned to freeze immediately.
The experiment actually included several other groups, one of which
is included in Figure 10.8A. In this group, once the rats were freezing,
the contingency between freezing and shocking was reversed so that now
the shock was programmed to punish freezing. Specifically, the rats now
received shocks at 5-second intervals once they froze for 15 seconds. Inter-
estingly, as the figure shows, once the new punishment contingency began,
freezing decreased abruptly, and the rats began freezing about 60% of the
time. Despite delivery of the shock, though, freezing never went down to
zero. The fact that it decreased so quickly and then never decreased any
further made Bolles and Riley (1973) suspicious about what the rats were
learning. They therefore took a closer look at the results and found that,
no matter what, the rats always showed a characteristic pattern or cycle
of responding after receiving each shock. This result is shown in Figure
10.8B. When a shock occurred, it immediately elicited a burst of activity
for a few seconds (Fanselow, 1982), and then the animal began to freeze;
it then froze for a long period of time. Remarkably, that is all you need
to know to understand the results shown in Figure 10.8A. For rats in the
avoidance group, the first shock caused a brief burst of activity; then, freez-
ing proceeded for several minutes—with nearly 100% freezing right away.
On the other hand, when freezing was punished, the rats began cycling
through the same pattern in another way. As usual, the first shock elicited
a burst of activity for a few seconds, and then the rats froze again. But once a rat accumulated 15 more seconds of freezing, another shock was delivered,
eliciting another activity burst and then more freezing. Each shock deliv-
ered after freezing broke up the freeze for a few seconds and then caused
freezing to occur again. What this means is that all the freezing measured
in the Bolles and Riley experiment was entirely elicited. It was controlled
by its antecedents, and not by its consequences. If you remember Skinner’s
fundamental distinction between operants and respondents (see Chapter
1), freezing turns out to be a respondent (controlled by its antecedents) and
not an operant (controlled by its consequences). This classic SSDR—one
of the easiest of all responses to learn to avoid shock—does not seem to be
an operant behavior at all.
This point needs a little further amplification. In the Bolles and Riley
(1973) experiment, shocking the rat inside the box actually allowed rapid
Pavlovian learning to occur. That is, the rat quickly associated the shock
with the box. It turns out that the box CS, and not the shock itself, is what
probably elicited all the freezing. (The shock itself mainly elicits the activ-
ity burst.) If a rat is given an electric shock and then removed and put into
a different box, it does not freeze; the rat must be returned to the box in
which it was just shocked (Blanchard & Blanchard, 1969; see also Fanselow,
1980). Freezing occurs in anticipation of shock; therefore, the rats froze in
the Bolles and Riley experiment because of Pavlovian learning. The rats
quickly learned that the box was associated with an electric shock and
froze there until (1) another shock occurred and interrupted the freeze or
(2) some extinction occurred after an extended period of further exposure
to the box, without another shock. This avoidance learning experiment was
actually a Pavlovian learning experiment in disguise.
There are two things to remember from all this information. First, the
animal’s biology—its evolutionary history—matters. The success of avoid-
ance learning depends at least partly on the degree to which the response
required is a natural defensive response. Second, there is a lot of Pavlovian
learning in avoidance learning. This point was originally made by two-factor
theory. According to SSDR theory and the results of some of the research it
stimulated, however, a great amount of behavior is controlled by learning
about cues in the environment. In the freezing experiment, the rats mainly
learned that the box was dangerous, and a natural behavior was addressed
to the box. In a jump-out or one-way avoidance experiment, the situation is
similar. According to SSDR theory in its later forms (e.g., Bolles, 1978), the
rat simply learns through Pavlovian learning that one place is dangerous (a
fear excitor) and the other place is safe (a fear inhibitor), and its natural flight
behavior takes care of the rest. Natural defensive behaviors are supported by
what the animal learns about cues in the environment. SSDR theory was one
of the earliest “behavior systems” theories that we discussed in Chapter 5.
Bolles went on to extend this idea even further with his student Michael
Fanselow (e.g., Bolles & Fanselow, 1980). Working in Bolles’ laboratory,
Fanselow discovered that CSs associated with an electric shock come to
elicit endorphins—natural substances that are released in the brain that
deaden pain (e.g., Fanselow, 1979). Thus, in addition to eliciting an SSDR,
a CS associated with shock also elicits a natural painkiller. The argument
was that this analgesic response served to get the animal ready for—and
thus adapt to—the upcoming shock. (Remember that Pavlovian signals
evoke whole constellations of behavior that serve to optimize the animal’s
interactions with upcoming events, as discussed in Chapters 2 and 5.)
According to Bolles and Fanselow (1980), one function of this analgesia is
to suppress recuperative behaviors—that is, behaviors elicited by tissue
damage (like licking a wound) that function to promote healing. Stopping
to lick a wound while fending off a predator would not make sense because
such an action would interfere with defense; thus, analgesia elicited by
cues for attack suppresses recuperative behaviors. The model envisioned
by Bolles and Fanselow is illustrated in Figure 10.9. Both danger signals
(Pavlovian excitors for shock) and the sight, smell, or sound of an actual
predator arouse the motivational state of fear, which in turn potentiates
SSDRs. We have already seen that a natural predator evokes freezing in
rodents; Lester and Fanselow (1985) likewise demonstrated that the pres-

[Figure 10.9 diagram: a predator or danger signal (CS) arouses fear, which potentiates SSDRs such as freezing, fleeing, and burying; tissue damage arouses pain, which potentiates recuperative behaviors such as wound licking and staying at home.]

Figure 10.9  The perceptual-defensive-recuperative model of fear and pain. The
presence of a predator or a danger signal (CS) arouses fear, which then potentiates
SSDRs (which are further selected by support stimuli in the environment). Fear also
inhibits other motivational states or systems, including pain. Pain itself is aroused
by tissue damage and selects another class of behaviors—recuperative behaviors.
Notice how the Fear and Pain systems are functionally organized to allow an ani-
mal to defend itself and recover from attack. (After Bolles & Fanselow, 1980.)

ence of a potential predator will also evoke an endorphin response. Which
SSDR actually emerges in the situation depends on support stimuli in the
environment. If the situation supports flight, the rat will flee; if the situation
does not support flight, the rat might freeze. If the rat is shocked when it
touches a small prod and sawdust or some other material is available, the
rat will bury the prod (e.g., Pinel & Mana, 1989). Fear also inhibits pain,
a motivational state evoked by tissue damage. That state, like fear, potentiates its own class of behaviors: recuperative behaviors, which essentially serve to care for the body and promote recovery after an attack.
Fanselow and Lester (1988; see also Fanselow, 1994) extended this
model even further with the predatory imminence theory I introduced
you to in Chapter 5. You may remember that different defensive behav-
iors are now thought to engage at “preencounter,” “postencounter,” and
“circa-strike” points along the imminence continuum. Each behavior is
designed to prevent the animal from going to the next point further along
the continuum (i.e., moving closer to the predator). In effect, the recu-
perative behaviors illustrated in Figure 10.9 are behaviors that are then
evoked in a kind of poststrike phase. In perhaps a further extension of
this approach, we have also seen that long-duration cues that might signal
trauma at a temporal distance are thought to evoke anxiety, whereas those
temporally closer to trauma evoke fear and panic (Bouton, 2005; Bouton,
Mineka, & Barlow, 2001). These functional analyses of fear, anxiety, and
defense after trauma are the intellectual legacy of the SSDR theory of
avoidance learning.
Cognitive factors in avoidance learning
One direction that theories of avoidance learning thus began to take was
very ethological. There has been an emphasis on natural defensive be-
haviors rather than new behaviors learned through reinforcement. What,
though, can all these innate behaviors in rats and mice really tell us about
avoidance behavior in humans? Interestingly, there is evidence that humans
perform SSDRs, too. For example, when people are exposed to social threat
cues (pictures of angry as opposed to neutral or happy faces), they reduce
their natural body movements and decrease their heart rate in a manner
consistent with freezing (Roelofs, Hagenaars, & Stins, 2010; for a review, see
Hagenaars, Oitzl, & Roelofs, 2014). And people with high levels of anxiety
demonstrate thigmotaxis: They stay near the edges when they explore an
empty soccer field, and they walk near the edge of a town square (or avoid
the square altogether) when they are asked to walk to a destination on the
other side of it in a small city (Walz, Mühlberger, & Pauli, 2016).
A larger point is that the emphasis in avoidance learning shifted from
operant reinforcement mechanisms to Pavlovian mechanisms; Pavlovian
cues associated with pain or emotional trauma in humans will evoke
behaviors that have evolved as natural defensive responses. Experience
with a trauma will almost certainly engage Pavlovian processes, and
these processes may go a long way in explaining anxiety disorders. For


example, as noted in Chapter 5, in panic disorder, associating anxiety
or panic with the local shopping mall might be sufficient to keep the
patient away from the mall. In Chapter 2, we discussed sign tracking:
We approach CSs for positive events and withdraw from CSs for nega-
tive events. This might explain quite a lot of avoidance behavior. And
if we want to treat avoidance behavior therapeutically, we will need to
extinguish the Pavlovian fear.
Although there is an explicit biological emphasis in SSDR-type theo-
ries of avoidance learning, the approach is also cognitive in the sense that
it emphasizes that organisms learn about danger and safety cues (e.g.,
Bolles, 1978). Instead of emphasizing S-R learning, though, the approach
notes that the behavioral effects of fear (freezing, fleeing, burying, etc.)
can vary and are supported by other stimuli in the environment. Loosely
speaking, the rat recognizes the CS as a danger cue, but the response the
CS evokes depends on other support stimuli (see Chapter 5). Further-
more, for organisms to learn about cues in the environment (that then
engage the defensive behavior system), the facts and theories of classical
conditioning that we have discussed in detail in earlier chapters now
necessarily apply. Remember that the empirical facts of Pavlovian learn-
ing caused most theorists to invoke processes that are not immediately
apparent in behavior (like short-term memory, attention, and priming) to
explain them. Thus, modern avoidance theory now envisions an organ-
ism not only performing natural defensive behaviors but also learning
about cues in its environment through the Pavlovian processes defined
and evaluated in Chapter 4.
But we must also understand how organisms learn avoidance be-
haviors that are not, in fact, SSDRs. Although such learning takes more
training, it is still possible for many rats to learn to shuttle in shuttle
boxes, and (with more difficulty) they can even be taught to press a lever.
We now know that there can be a kind of trade-off between defensive
reactions like freezing and avoidance responding in these situations.
For example, rats in shuttle avoidance experiments freeze at high levels
early in training, but freezing declines as training continues and the
animals learn to shuttle and avoid (e.g., Choi, Cain, & LeDoux, 2010;
Lázaro-Muñoz, LeDoux, & Cain, 2010; Moscarello & LeDoux, 2013). In-
terestingly, rats that become successful avoiders freeze less than poor
avoiders, and the poor avoiders become better at avoiding if their freez-
ing is reduced by lesioning a part of the brain that influences freezing
(the central nucleus of the amygdala; Choi et al., 2010; Lázaro-Muñoz et
al., 2010). One idea is that an important function of successful avoidance
learning is to shut down or inhibit the kind of natural fear responses
(SSDRs) that normally occur in frightening situations (LeDoux, 2012,
2015; Moscarello & LeDoux, 2013).

So, how do successful avoiders do it? One insight is that in cases like
shuttle and lever-press avoidance, in contrast to cases in which the task
requires a simple SSDR, the warning signal termination contingency is
important. That is, turning the warning signal off once the response is
made is necessary to achieve good learning (e.g., Bolles & Grossen, 1969;
Bolles et al., 1966). Therefore, one possibility is that reinforcement through
fear reduction—that is, escape from learned fear—is still important when
the required behavior is not an SSDR. A somewhat different idea is that
warning signal termination provides the animal in these situations with
“feedback” for the correct response (Bolles, 1970).
The idea that CS termination provides feedback was consistent with
experiments by Bolles and Grossen (1969), which showed that in situa-
tions in which warning signal termination is important, leaving the warning signal on but adding a brief cue at the moment the response was made improved learning just as effectively as turning the signal off. Feedback cues
can also guide choice; when experimenters have managed to train rats
to make two lever-press avoidance responses, the rats choose to make
the response that yields a feedback stimulus (Fernando, Urcelay, Mar,
Dickinson, & Robbins, 2014). The feedback cue might work because it
has become a conditioned fear inhibitor. Notice that if a warning signal
is paired with an electric shock on its own—but not when it is accompa-
nied by a response (and the feedback stimulus)—the feedback stimulus
and the response are essentially negative features in a feature-negative
discrimination (see Chapters 4 and 5). Morris (1974) and Rescorla (1968a)
showed that feedback stimuli used in avoidance training do, in fact, be-
come conditioned inhibitors (see also Fernando et al., 2014). Furthermore,
if animals are first given Pavlovian training that establishes the feedback
stimulus as a fear inhibitor, this training increases the strength of an
avoidance response when it is presented as a consequence of the response
(Rescorla, 1969a; Weisman & Litner, 1969). Thus, feedback stimuli may
reinforce avoidance because they inhibit fear.
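In the notation for feature-negative discriminations used in earlier chapters, the arrangement can be summarized as follows; here "WS" stands for the warning signal, "R" for the avoidance response, and "FB" for the response-produced feedback stimulus (the labels are chosen for convenience):

\[
\text{WS} \rightarrow \text{shock}
\qquad\qquad
\text{WS} + \text{R}\,(+\,\text{FB}) \rightarrow \text{no shock}
\]

This has the same structure as A+ versus AX– training, with the response and the feedback cue playing the role of the negative feature X.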
Given that any response probably produces some natural stimulus
feedback, perhaps the avoidance response itself becomes a conditioned
inhibitor, acquiring a kind of negative association with the shock. The idea
that behaviors can function as safety signals is consistent with observations
made by clinical psychologists who have noticed that people with anxiety
disorders often acquire certain "safety behaviors" (e.g., Salkovskis, Clark, &
Gelder, 1996). Such behaviors, such as carrying an empty pill bottle around
everywhere for comfort, are basically avoidance behaviors that reduce fear
(e.g., Bouton et al., 2001). Safety behaviors are arguably problematic for
therapy because their presence may maintain fear of the warning signal.
Carrying the pill bottle might make a person with agoraphobia feel safe
leaving the house—but only as long as the pill bottle is present. As we saw
in Chapter 4, the presence of an inhibitor in compound with the CS will
“protect” the CS from extinction (e.g., Rescorla, 2003). So, thinking of the
response and its consequences as the equivalent of a Pavlovian cue has
further predictive and explanatory power.
In another cognitive approach to avoidance, Seligman and Johnston
(1973) suggested that organisms learn that the avoidance response pre-
dicts no aversive US; in addition, not performing the response predicts
the US. These expectancies lead the organism to avoid. One problem
with this approach is that it seems to imply that avoiders are awfully
cool and rational. In contrast, we know that emotion is also present in
avoidance learning; the warning signal can clearly elicit fear, at least
early in training (e.g., Kamin et al., 1963; Starr & Mineka, 1977). When
humans are run in experiments in which they learn to press a button in
the presence of a cue to avoid a mild shock delivered to the wrist, they do
report expectancies that the warning stimulus will lead to shock as well
as that the response will lead to no shock (e.g., De Houwer, Crombez, &
Baeyens, 2005; Lovibond, Chen, Mitchell, & Weidemann, 2013). But they
also show an anxious electrodermal response (palm sweating) during the
warning signal and less of it when they can make the avoidance response
(e.g., Lovibond et al., 2013). Lovibond (2006) proposed an extension of the
cognitive theory that helps integrate cognition with behavior and emo-
tion. He suggested that individuals learn that the warning signal predicts
shock and that the expectancy of shock arouses fear or anxiety. They also
learn that responding leads to the omission of shock. The combination
of these expectancies leads to avoidance in the presence of the warning
signal, and the expectancy of no shock caused by the response makes
them less afraid when they do avoid.
This kind of approach implies that avoidance behavior can become very
persistent because if shock is never actually associated with the response,
there is never anything to disconfirm the expectancy of “response-no
shock.” One problem is that avoidance responding does not persist forever.
That is, responses do eventually decline if they are allowed to occur over
and over without shock (Mackintosh, 1974), even though each response
would presumably further confirm the response–no shock expectancy. An-
other possible problem is that humans continue to avoid even in the face
of knowledge that the response predicts shock. In an experiment by De
Houwer et al. (2005), humans learned to avoid a mild shock by pressing
different computer keys. One response (R1) avoided shock in the presence
of one warning signal (A; e.g., a square presented on the computer screen),
and another response (R2) avoided shock in the presence of a second signal
(B; e.g., a triangle). After the responses were learned, a second phase began.
The avoidance trials with R1 and R2 continued. On separate trials, however,
the participants were also instructed to make R2 (without its warning sig-
nal); this time, each response made a shock occur. Although it caused the
participants to report that they expected shock whenever they made the
response, in a final phase they also reported that they expected “no shock”
when the response was tested again in the presence of the original warning

[Figure 10.10 plots mean shock expectancy ratings (0 to 100) for six test conditions: A, A/R1, R1, B, B/R2, and R2.]

Figure 10.10  Shock expectancy ratings by humans at the end of an avoidance
experiment. The participants made different responses (R1 and R2) by pressing
different keys on a keyboard. R1 avoided shock in the presence of CS A, whereas
R2 avoided shock in the presence of a CS B. After initial training, the participants
were also occasionally instructed to perform R2, which led to presentation of a
brief electric shock. This treatment caused the participants to expect shock after
R2 (far right), but R2 still inhibited the shock expectancy when it was combined
with CS B (B/R2). The avoidance response is similar to that of a Pavlovian occa-
sion setter. (After De Houwer et al., 2005.)

stimulus (Figure 10.10)! Minimally, the cognitive theory would need to
assume that the participants had learned that their “response–no shock”
expectancy was true only during the actual warning stimulus. A problem
for this idea is that the response–no shock expectancy was not specific to its
original stimulus; De Houwer et al. (2005) showed that it was also present
when it was tested in a third warning stimulus, C (which, for simplicity, is
not shown in Figure 10.10; see also Declercq & De Houwer, 2011).
De Houwer et al. (2005) argued that the results were what you would
expect if the response had become a negative occasion setter informing the
subject that the shock would not follow the warning signal (see Chapter
5). During training, the warning signal was paired with an electric shock
unless the response was made; the response then set the occasion for a
signal–no shock relation. Remember that the power of an occasion setter
is not affected by its direct association with the US (see Chapter 5). Thus,
associating the response with shock in the second phase had no effect on
its ability to set the occasion for the signal–no shock relationship.
The idea that an avoidance response may be a negative occasion setter
is a variation of the idea that the response is analogous to a conditioned
inhibitor—negative occasion setting is a form of inhibition. In the long
run, the evidence currently favors the idea that we learn about avoidance
responses in a way that is similar to how we learn about CSs in similar
arrangements. Response learning seems to follow the rules of Pavlovian
learning (see also Lovibond et al., 2013).
Learned helplessness
Another line of research also led to a cognitive perspective on avoidance
behavior. Overmier and Seligman (1967) found that avoidance learning
proceeded very slowly if their dog subjects were first given inescapable
shocks before the avoidance training (see Overmier & Leaf, 1965, for ear-
lier observations). Seligman and Maier (1967) then showed that it was the
uncontrollability of the initial shocks that was crucial. Their experiment is
summarized in Figure 10.11. They used three groups in what has become
known as the “triadic design.” In the first phase, one group was given a
series of electric shocks that could be escaped (turned off once they had
started) if the dog pressed a panel with its nose. A second group was
“yoked” to the first; each animal received exactly the same shocks at the
same points in time as a partner animal in the first group, but the behavior
of the second group of dogs did not influence when the shock went on or
off. A third group of dogs received no shocks at all in the first phase. In the
second phase, all groups could escape shock in a shuttle box by shuttling
from one side to another. As shown in Figure 10.11, the groups differed in
how well they performed in this phase. The groups given either no shock
or escapable shock escaped most of the shocks, but the group first given in-
escapable shock was very poor at learning to escape. Six out of ten subjects
never responded during the shock. Exposure to inescapable shock—but not
escapable shock—caused a profound deficit in subsequent escape learning.
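The logic of the yoking manipulation is worth spelling out. The sketch below, written in Python, shows how shock delivery works for the three groups on a single trial; the function and parameter names are hypothetical and purely illustrative.

# Illustrative sketch of the triadic design. The escapable animal's response
# turns its shock off; its yoked partner receives a shock of exactly the same
# duration regardless of its own behavior; the third group receives no shock.

def triadic_trial(escape_latency, max_shock=30.0):
    """escape_latency: seconds until the escapable animal makes the response."""
    escapable = min(escape_latency, max_shock)  # shock ends with the response
    yoked = escapable                           # same duration, but nothing the
                                                # yoked animal does matters
    no_shock = 0.0
    return {"escapable": escapable, "yoked": yoked, "no_shock": no_shock}

print(triadic_trial(escape_latency=3.5))
# {'escapable': 3.5, 'yoked': 3.5, 'no_shock': 0.0}

Because the two shocked groups receive physically identical shocks, any difference between them in the second phase must be attributed to the controllability of the shock rather than to the shock itself.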
Seligman and Maier (1967) suggested that exposure to inescapable shock
actually had several effects. Dogs exposed to inescapable shock did not initi-
ate responding in the second phase; they had what appeared to be a “mo-
tivational deficit.” They also failed to show as much learning in the second
phase; they had what was termed a “cognitive deficit.” And they seemed
to accept the shocks delivered in the second phase passively—that is, they
also seemed to have an “affective” or “emotional deficit.” In another experi-
ment published in the same paper, Seligman and Maier also demonstrated

Group          Phase 1                       Phase 2            Phase 2 shocks escaped (%)
Escapable      Escapable shock               Escape training    74
Inescapable    Yoked, inescapable shock      Escape training    28
No shock       ———                           Escape training    78

Figure 10.11  The learned helplessness effect. In Phase 1, different groups
receive escapable electric shocks, inescapable shocks, or no shock. In Phase 2,
when allowed to escape or avoid shock in a different environment, the group ex-
posed to inescapable shocks does poorly. (Data from Seligman & Maier, 1967.)

that exposure to escapable shocks before exposure to inescapable shocks
protected the animal from these deficits. This protection is called the im-
munization effect (see also Maier & Watkins, 2010; Seligman, Rosellini,
& Kozak, 1975; Williams & Maier, 1977). The idea that emerged was that
organisms do more than merely learn to respond in instrumental learning
experiments; they may acquire knowledge about the relationship between
their behavior and its outcomes. When exposed to inescapable shock, the
dogs behaved as if they had acquired the belief that their actions and shock
termination were independent, and this belief seemed to generalize to a new
action in a new situation in the second phase (Maier & Seligman, 1976; Maier,
Seligman, & Solomon, 1969; Seligman, Maier, & Solomon, 1971). This view is
often referred to as the learned helplessness hypothesis. The phenomenon
itself—the finding that inescapable shock interferes with subsequent escape
learning—is often called the learned helplessness effect.
Notice that, when a shock is inescapable, shock termination is equally
probable whether the animal responds or not. In this sense, there is no con-
tingency between the action and the outcome. There is a connection here
with what we have previously seen in Pavlovian learning. Remember that
Rescorla (e.g., 1967b, 1968a; see Chapter 3) discovered that the contingency
between a CS and a US determines what the animal learns about the CS:
When the probability of a US is greater in the CS than outside it, the CS
becomes an excitor; when the probability of the US is less in the CS than
outside it, the CS becomes an inhibitor; and when the probability of the US
is the same regardless of whether the CS is on or off, the CS becomes nei-
ther. Rescorla never claimed that the animals learned about the contingency
itself; the contingency between CS and US simply predicts whether the CS
will be an excitor or an inhibitor. (This agnostic view was accepted by the
theories of conditioning, covered in Chapter 4, that were developed later.)
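The parallel can be stated compactly. The CS-US contingency Rescorla manipulated is often summarized in later treatments as a difference between two conditional probabilities, and the response-outcome relation that matters for helplessness has the same form (the notation is a common convention, not Rescorla's own):

\[
\Delta p_{\mathrm{CS}} = P(\mathrm{US} \mid \mathrm{CS}) - P(\mathrm{US} \mid \text{no CS})
\qquad\qquad
\Delta p_{\mathrm{R}} = P(\mathrm{O} \mid \mathrm{R}) - P(\mathrm{O} \mid \text{no R})
\]

Positive values correspond to excitatory relations, negative values to inhibitory ones, and zero to no contingency. For the inescapably shocked animal, shock termination is equally likely whether or not it responds, so the response-outcome difference is zero.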
On the other hand, other theorists have come closer to claiming that ani-
mals do learn about the contingency. Mackintosh (e.g., 1973; see also Baker,
1974; Baker & Mackintosh, 1977) found that when there is zero contingency
between a CS and a US, it is harder to learn to associate the two events
when they are later paired (but see Bonardi & Hall, 1996; Bonardi & Ong,
2003; Killcross & Dickinson, 1996). This effect is called learned irrelevance.
Although Mackintosh argued that zero contingency between a tone and
a shock might reduce attention to—or associability of—the tone, there is
also evidence that a more “generalized learned irrelevance” can develop.
That is, exposure to a zero contingency between a tone and a shock can also
make it difficult to learn that a different CS, like a light, is associated with
shock (Dess & Overmier, 1989; Linden, Savage, & Overmier, 1997). Notice
the parallel with the learned helplessness effect. When exposed either to an
uncorrelated CS and US or an uncorrelated action and outcome, organisms
may learn a kind of “belief” that similar events will also be independent.
The learned helplessness hypothesis became generally well known to
psychologists. Maier and Seligman (1976) wrote an influential paper that
was read by many experimental psychologists, and Seligman (1975) wrote
a book for a general audience that linked the phenomenon to depression
in humans (the book was entitled Helplessness: On Depression, Development,
and Death). The idea was that uncontrollable events like the loss of a job
or death of a loved one may lead a person to the kind of motivational,
cognitive, and emotional deficits seen in the original animal experiments.
Thus, a depressed person may acquire a belief that nothing that he or
she does makes a difference or matters, and this belief leads to a kind of
giving up about life. The theory has moved forward since then (e.g., see
Overmier & LoLordo, 1998, for one review). For example, Seligman and
his colleagues (e.g., Abramson, Seligman, & Teasdale, 1978) went on to
emphasize causal attributions people make about the events in their lives;
helplessness and depression mainly develop when uncontrollability is at-
tributed to the person’s own failures (rather than an external cause), is as-
sumed to be persistent, and is assumed to be a global characteristic of life.
Seligman (e.g., 1990) went on to emphasize the positive effect of learning
“optimism,” which results from attributions that outcomes are globally
and persistently affected by one’s behavior.
Research in animal laboratories went in a somewhat different direction.
Investigators quickly realized that several different factors can contribute
to the escape deficit reported by Seligman and Maier (1967; for a review,
see LoLordo & Taylor, 2001). In the long run, none of these factors replaced
the idea that animals can learn about the relation between behavior and
outcome. They nonetheless need to be summarized here because they re-
mind us that behavioral effects rarely have single causes. First, exposure
to inescapable shock can simply make an animal less active (e.g., Anisman,
DeCatanzaro, & Remington, 1978; Glazer & Weiss, 1976). Although sup-
pressed activity could make it difficult to learn an active escape response,
this cannot be the whole story. Maier (1970) compared active escape learn-
ing in groups that first had inescapable shocks or escape training in which
they had to stand motionless to escape. Even though standing still was
especially incompatible with the active escape response, subjects in the
escape group still learned the active response better than subjects that had
been inescapably shocked! Inescapable shock can also interfere with escape
in a maze in which the animal must choose to go left or right (Jackson, Al-
exander, & Maier, 1980; Lee & Maier, 1988; Minor, Jackson, & Maier, 1984).
Suppressed activity should make responding slow, but not make choices
inaccurate. The results therefore further suggest that suppressed activity
cannot completely explain the learned helplessness effect.
A second possibility is that inescapable shock numbs the animal and
makes it feel less pain. Analgesia does result from inescapable shock ex-
posure (e.g., Jackson, Maier, & Coon, 1979; Maier, Sherman, Lewis, Ter-
man, & Liebeskind, 1983), but it is not necessary for helplessness. Certain
physiological manipulations that eliminate analgesia do not eliminate the
helplessness effect (MacLennan et al., 1982). Third, exposure to inescap-
able shock can reduce the brain level of certain neurotransmitters, such
as norepinephrine and serotonin (e.g., Weiss et al., 1981). But how these
effects play out over time after the receipt of inescapable shock does not
always match the time course of the helplessness effect—and, equally im-
portant, it is not clear whether they interfere directly with escape learning
or merely reflect the animal’s expectation of act-outcome independence
(e.g., Overmier & LoLordo, 1998). A good deal of excellent research has
further studied the brain systems that accompany learned helplessness
and the immunization effect (e.g., Maier & Watkins, 2005, 2010). Exposure
to inescapable shock has a large number of effects, but animals clearly
do learn something other than a competing response that interferes with
escape learning in the second phase.
But what is the nature of the cognitive deficit? One possibility is that
inescapable shock somehow causes the animal to pay less attention to its
behavior, just as the animal may pay less attention to a CS that is not cor-
related with a US (Minor et al., 1984). Consistent with this idea, if one adds
a feedback stimulus during escape training in the second phase so that
the new response terminates shock and also causes an attention-grabbing
termination of the lights, inescapably shocked rats begin to learn as well
as unshocked rats (Maier, Jackson, & Tomie, 1987). Unfortunately, other
evidence is not so clearly consistent with an attentional interpretation. After
finally learning the response in a second phase, inescapably shocked rats
will extinguish responding quickly (Testa, Juraska, & Maier, 1974), or they
can be quick to learn that behavior and outcome are uncorrelated again
(Rosellini, DeCola, Plonsky, Warren, & Stilman, 1984). Neither result seems
consistent with the idea that “helpless” animals do not pay attention to
their behavior (LoLordo & Taylor, 2001).
One of the most important take-home messages of the research that
followed discovery of the learned helplessness effect is that controllability
of a bad event has a powerful influence on the stress it produces (Minor,
Dess, & Overmier, 1991). For example, exposure to uncontrollable (and
unpredictable) shocks is stressful enough that it can cause rats to develop
ulcers; the level of ulceration is reduced, though, if the same shocks are
controllable (and predictable) (Seligman, 1968; Weiss, 1968). Exposure to
a briefer series of uncontrollable or controllable shocks can also increase
or decrease (respectively) the extent to which other stressful events lead to
ulceration (Overmier & Murison, 2013). Uncontrollable shock also causes
more fear conditioning than controllable shock when the two are compared
as USs in fear conditioning (e.g., Mineka, Cook, & Miller, 1984; Rosellini,
DeCola, & Warren, 1986). In the long run, the controllability of events turns
out to be an important dimension that influences their psychological effect
(for further evidence, see Mineka & Hendersen, 1985; Mineka, Gunnar, &
Champoux, 1986).
Maybe learning about the controllability of a stressor, rather than its
uncontrollability, is the really important factor. But why does being able to
escape a shock reduce its stressfulness? It seems to rely on the predictive
value of the escape response. As illustrated in Figure 10.12, a response that
turns a shock off might predict both shock offset and the upcoming intertrial
Figure 10.12  Temporal relationships between a shock, an escape response that turns it off, a cessation CS, and a backward CS. The escape response is initiated just before the electric shock ends and then persists a bit afterward. A cessation CS precedes the offset of shock, whereas a backward CS follows it. A cessation CS is especially good at reducing the effects of inescapable shock—it has effects like an escape response. (After Minor et al., 1990.)

interval that is free from shock. Investigators have therefore asked whether a
CS presented either immediately before shock termination (a shock-cessation
cue) or immediately after the shock (a “backward” cue; see Chapters 3 and
4) similarly softens the effect of inescapable shock. Adding a backward CS
just after each inescapable shock does reduce the fear conditioning the shock
causes (e.g., Mineka et al., 1984). However, a backward CS is less effective
than an escape response at reducing the effect of the shock when there are
short intertrial intervals between shocks (e.g., Minor, Trauner, Lee, & Dess,
1990; Rosellini et al., 1986; Rosellini, Warren, & DeCola, 1987). The backward
CS—but not the escape response—seems to work only when it signals a fairly
long minimum period free from shock (Moscovitch & LoLordo, 1968). A
cessation cue has a bigger effect. It softens the impact of inescapable shock,
regardless of whether the next intertrial interval is short or long (Minor
et al., 1990). It also produces an immunization effect—that is, exposure to
inescapable shocks with an added cessation cue protects the subject from
helplessness caused by future inescapable shocks (Minor et al., 1990). A com-
bination of cessation and backward signals seems especially effective. To
conclude, escapable shocks might be less harmful than inescapable shocks
because the escape response signals shock cessation. This might weaken the
aversiveness of shock through a process like counterconditioning (for further
discussion, see LoLordo & Taylor, 2001); association of the shock with the
positive properties of the response might make the shock less aversive. It is
worth noting that the effects of safety signals may be mediated by different
neural mechanisms (e.g., Christianson et al., 2008). But an important feature
of an instrumental action may be its signaling properties.
Summary: What does it all mean?


In addition to its practical relevance in helping us to understand stress and
depression, research on learned helplessness has had a broad influence on
theories of learning. First, from its earliest days, the helplessness phenom-
enon suggested a parallel between how animals learn in instrumental and
Pavlovian situations. That is, the apparent effects of action-outcome indepen-
dence paralleled the effects of CS-US independence in Pavlovian learning.
(It is interesting to note that Rescorla’s work on contingencies and Overmier,
Maier, and Seligman’s work on helplessness were all initially conducted
while these investigators were graduate students in Richard Solomon’s labo-
ratory.) Second, research on helplessness began to reinforce the idea that the
response has signaling attributes—escapable shock may be less detrimental
than inescapable shock because the response is analogous to a Pavlovian
cessation cue. Finally, the phenomenon led to a more cognitive perspective
on instrumental learning. Animals may learn more than merely performing
a certain response in a certain situation. Their knowledge about the relation-
ship between their actions and outcomes also may count.
More generally, research on avoidance learning has highlighted the
influence of Pavlovian learning on instrumental performance ever since
Mowrer’s two-factor theory was first proposed. Today, in addition to mo-
tivating and providing a possible basis for reinforcement, Pavlovian fear
is thought to arouse a system of natural behaviors that gets the system
ready for the aversive outcome (see also Chapters 2 and 5). Furthermore,
the emphasis on feedback stimuli and inhibition in explaining avoidance
behavior suggests that behavior is like a safety signal or negative occasion
setter. Avoidance behavior is strongly influenced by S-O learning, and the
response has properties analogous to those of a CS.

Parallels in Appetitive Learning


Of course, another implication of research on avoidance learning was that
not all behaviors are equally learnable—that is, biology and evolution also
have an influence. A surprising amount of the learning is about stimuli in
the environment; animals learn about cues that predict danger and safety,
and organisms then direct their natural behavior toward these CSs. One of
the interesting things about this aspect of avoidance learning is that it has
an almost perfect parallel in learning about appetitive goals and outcomes,
such as food.
The misbehavior of organisms
In a classic paper that was mentioned briefly in Chapter 2, Breland and
Breland (1961) described many attempts to train animals to perform dif-
ferent behaviors by applying the principles of reinforcement. (You may
remember that the Brelands had a business in which they trained animals
to perform, for example, on television commercials.) Unfortunately, rein-
forcement did not always work; the paper was titled “The Misbehavior
of Organisms,” a clever play on the title of Skinner’s famous book, The
Behavior of Organisms (1938). You may remember that the Brelands tried to
reinforce a pig for putting wooden coins into a piggy bank. As the Brelands
patiently shaped the pig to take the coins and deposit them, the pig began
dropping them and rooting them around the pen with its snout as if the
coins were bits of food. The behavior persisted despite never being rein-
forced. The Brelands also tried to train a raccoon to do the same thing—to
put a coin into a piggy bank. This animal learned to drop a coin into a con-
tainer when reinforced to do so, but when the Brelands pressed their luck
and tried to shape the raccoon to deposit two coins rather than one, things
began to fall apart. At this point, the raccoon took the coins and, instead of
depositing them in the container, rubbed them together and dipped them
in and out. Again, this behavior persisted despite no reinforcement. Other
reinforcement failures involved chickens. At one point, the Brelands tried
to train one or two chickens to stand on a platform for 15 seconds to receive
food. The chickens would not quite do that; instead of standing still dur-
ing the interval, they scratched the floor persistently with their feet. (The
Brelands took advantage of this behavior and instead taught the chickens
to dance for 15 seconds to the sound of a jukebox.) Other chickens were
taught to play baseball in a miniature baseball park. Pulling a loop made
a small bat swing and hit a ball that went rolling out into the outfield. If
the bird then ran to first base, it was reinforced. Things were fine as long
as the bird was confined to a cage along the first-base line. But if the bird
was let loose and had access to the ballpark, it would chase the ball and
peck it all around the outfield.
Other examples of misbehavior have been reported (e.g., Timberlake,
Wahl, & King, 1982). The Brelands emphasized that each case represented
a failure of reinforcement and noted that each involved the intrusion of an
instinctive behavior that got in the way (they called this “instinctive drift”).
Notice, however, that although the operant part of the experiment failed,
the animals were, in fact, learning something. Each was associating a salient
cue (or CS) with food. During shaping, the pig and raccoon presumably
had many coin-food pairings, and the chicken had several opportunities
to learn that the 15-second interval—or the miniature baseball—signaled
food. Each of these CSs then evoked a natural food-getting behavior. As
Bolles had discovered in avoidance learning, a lot of learning in operant
situations is about Pavlovian signals, and these signals can sometimes exert
control by eliciting natural behaviors.
Superstition revisited
The idea that reinforcement can fail seems inconsistent with the view that
reinforcers are so powerful that they can even increase the strength of a
behavior that they are accidentally paired with. This idea is the theoretical
basis for “superstitious behavior” that we discussed in Chapter 7. You may
recall that in the “superstition experiment,” Skinner (1948) delivered food
to pigeons at regular intervals—regardless of what the pigeons happened
to be doing. When Skinner looked in on the birds a few hours later, some
of them were thrusting their heads in the corner, some were rocking back
and forth, and so on, as if these responses had been required to obtain food.
Skinner suggested that somewhere along the line each bird had emitted a
chance behavior at the instant that food was presented. The effect of this
accidental pairing of behavior and reinforcer was to make the bird respond
enough so that the behavior was (accidentally) rewarded again when the
next reinforcer was presented. The implication was that even arbitrary,
accidental pairings seem to cause change. How can this idea be reconciled
with the kinds of failures suggested by Breland and Breland (1961)?
Staddon and Simmelhag (1971) repeated Skinner’s experiment and
carefully watched the birds’ behavior develop over time. What emerged
was a pattern of behaviors between successive presentations of food like
the one illustrated in Figure 10.13A. As time went by after a presentation
of food, a behavior began to emerge and fill up all the time until the next
reinforcer was delivered. Because this behavior occurred at the end of the
food-food interval, Staddon and Simmelhag called it terminal behavior.
In every bird, terminal behavior involved pecking movements, usually

Figure 10.13  Behavior of a pigeon (A) and rat (B) given food Os at regular intervals. The pigeon was given food every 12 seconds. Soon after the O, there was behavior directed toward the window of the Skinner box (window wall) and also some wing flapping, but these behaviors then declined over time as the next food presentation became imminent—those are interim behaviors (or adjunctive behaviors). Other behaviors (being near the magazine wall and pecking) increased and then peaked toward the end of the interval, when food was expected—those are terminal behaviors. The rat (B) was given a food pellet every 30 seconds. Here again, some behaviors peaked early and then declined (they are interim behaviors). In contrast, being near the feeder peaked toward the end of the interval when food was expected (a terminal behavior). (A, after Staddon & Simmelhag, 1971; B, after Staddon & Ayres, 1975; figure inspired by Staddon, 1977.)
oriented toward the front of the chamber or near the food hopper. Although
the pecking tended to occur when food was about to be presented, careful
analysis suggested that it was not the behavior that was paired with food
early in training. For example, early in training, the typical behavior that
was actually paired with the next reinforcer was the bird having its head
in the hopper, not pecking near the front wall. Such observations suggest
that if reinforcement was operating here, head-in-hopper should have been
learned as the bird’s terminal behavior.
Earlier in the food-food interval, other kinds of behaviors were more
probable. These behaviors varied between birds, although an individual
bird would be fairly persistent at demonstrating the same behaviors from
interval to interval (e.g., some birds turned in three-fourths of a circle, some
birds flapped their wings). These behaviors were called interim behaviors.
It is not entirely clear what Skinner (1948) had originally observed, but
many of his superstitious behaviors were probably interim behaviors. The
important thing, though, is that according to Staddon and Simmelhag’s ob-
servations, the interim behaviors were never occurring at the moment that
a reinforcer was delivered. They were behaviors that the pigeons seemed to
do to pass the time before the next reinforcer was going to happen. There
was very little operant learning in their superstition experiment. Instead,
the birds learned about the time between food deliveries and began predict-
ing food based on a temporal CS. When time predicted feeding, terminal
behaviors were evoked. When time did not predict food for a while, the
bird engaged in behavior that seemed to kill or pass the time between bouts
of terminal behavior.
Interim behavior is related to many adjunctive behaviors that experi-
menters have observed in operant situations (Falk, 1961, 1977). Perhaps
the most remarkable of them is a phenomenon called schedule-induced
polydipsia (Falk, 1961). In the typical experiment, rats are reinforced for
pressing a lever in a Skinner box with a food pellet delivered on an inter-
val schedule (see Chapter 7). A water bottle is attached to the side of the
chamber. The animal presses the lever as expected, but remarkably, soon
after earning each reinforcer, the rat goes over to the water bottle and con-
sumes an excessive amount of liquid before returning to the lever-pressing
task (Figure 10.13B). Falk (1961) reported that in a 3-hour session, a rat on
a VI 1-minute schedule drank about half its body weight in water. (Note
that there is relatively little cost to excessive drinking like this because the
rat has a bladder.) Interestingly, if an ethanol solution is available, rats
will drink so much of it over sessions that they eventually show signs of
becoming alcohol-dependent (e.g., Falk, Samson, & Winger, 1972; Sam-
son & Pfeffer, 1987; for one application to humans, see Doyle & Samson,
1988). Animals will perform a variety of behaviors as adjuncts to schedules
of reinforcement. For example, pigeons may attack another pigeon (e.g.,
Cohen, Looney, Campagnoni, & Lawler, 1985), and rats may run in wheels
(Levitsky & Collier, 1968; Staddon & Ayres, 1975). One explanation is that
schedules of reinforcement engage approach behavior when they are rich
and a tendency to escape when they are lean. Adjunctive behavior tends
to emerge at intermediate reinforcement rates, when there may be conflict
between the tendencies to approach and escape (Falk & Kupfer, 1998). The
phenomenon may be analogous to certain “displacement activities” noted
by ethologists in conflict situations (e.g., Falk, 1977). For example, a gull in
a dispute with another gull over a territorial boundary may often stop its
aggressive posturing and grab and pull a wad of grass in its beak. Adjunc-
tive behaviors might also be part of the natural behavior system evoked
by the reinforcer (e.g., Lucas, Timberlake, & Gawley, 1988; Timberlake,
2001; Timberlake & Lucas, 1985, 1991). For example, interim (adjunctive)
and terminal behaviors may be seen as focal search and consummatory
behaviors that occur at different temporal distances from the next reward
(see Chapter 5). Timberlake (2001) also noted that excessive drinking may
occur because drinking is part of the rat’s natural postmeal repertoire—a
drink after a meal is functional because it aids digestion. Thus, it is a post-
food (rather than prefood) part of the behavior system that might come to
the fore when the animal learns to expect food at regular intervals.
You may have seen examples of adjunctive behaviors in computer
rooms where students can be found drumming with their pencils, biting
their fingernails, scratching their heads, or fiddling with candy wrappers
or water bottles between bursts of actual work. These behaviors can be-
come habitual, and they basically fill the gaps between reinforcers. Many
writers have argued that this process controls many repetitive and some-
times harmful behaviors in humans (e.g., Cantor & Wilson, 1985; Falk &
Kupfer, 1998). For the present, though, it is worth emphasizing that some
rather surprising behaviors can emerge and dominate when reinforcers
are scheduled intermittently. And, importantly, they seem to have little
to do with reinforcement in the traditional law-of-effect sense: No direct
response-reinforcer pairing has “stamped” them in (for recent discussions
of this possibility, however, see Killeen & Pellón, 2013; Boakes, 2015). As the
name implies, adjuncts are mostly seen as adjuncts to operant behaviors,
and they appear to occur at precisely the time when periodically scheduled
and predictable reinforcers are not about to occur.
A general role for stimulus learning in response learning situations
One intriguing feature of the revisited pigeon superstition experiment
(Staddon & Simmelhag, 1971) is the pecking behavior itself. Although other
replications of the superstition experiment have found far less pecking
(e.g., Timberlake & Lucas, 1985), there is no doubt that pecking can be
powerfully controlled by Pavlovian—as opposed to operant—contingen-
cies. “Autoshaping” was introduced in Chapter 3. That phenomenon was
discovered by Brown and Jenkins (1968), who put pigeons into a standard
Skinner box in which birds can direct key pecks at the usual key. All Brown
and Jenkins did was illuminate the key for 8 seconds and then present the
food. Within about 50 trials like this, the typical bird went up to the key
when it was illuminated and pecked at it. The key-pecking behavior was
automatically shaped, and the rest is history. Pecking appears to result
automatically from Pavlovian pairings of the key light and food, and au-
toshaping has become a standard method for studying Pavlovian learning.
In an experiment that followed closely on the original, Williams and
Williams (1969) repeated the study with a devious twist. As in the Brown
and Jenkins (1968) experiment, the key light was turned on and paired with
food. But, in this case, the food only occurred if the bird did not peck. If
the bird pecked, the light was turned off, and the presentation of food was
prevented. There was therefore a negative contingency between pecking
and reinforcement, a so-called omission contingency (Sheffield, 1965). If
the pecking response was controlled by its consequences, the bird should
not have pecked. On the other hand, if the bird did not peck on every trial,
at least some key light-food pairings will have still occurred, and the key light
may have remained a positive predictor of food. What happened in this
experiment was famous. The birds continued to peck on most of the trials
when the key light came on. Despite the negative peck-reinforcer contin-
gency, the behavior was maintained. This phenomenon is therefore known
as negative automaintenance. The pecking response was elicited by the
light coming on, but it was not strongly controlled by its consequences.
Both autoshaping and negative automaintenance suggest that pecking is
largely a respondent behavior rather than an operant. The pigeon behaves
as if it cannot help pecking at a key light signaling food.
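The omission contingency used by Williams and Williams can be stated compactly as a rule. The sketch below is only a schematic of that rule, not the original apparatus code; the function name is hypothetical, and the 8-second trial duration mentioned in the comment is borrowed from the Brown and Jenkins procedure described above.

def omission_trial(pecked_during_keylight: bool) -> bool:
    """One schematic trial of an omission (negative automaintenance) procedure.

    The key light comes on for a fixed period (8 seconds in the Brown and
    Jenkins procedure).  A peck while the light is on turns the light off and
    cancels the food for that trial; only a trial with no peck ends with food.
    Returns True if food is delivered.
    """
    if pecked_during_keylight:
        return False  # pecking prevents reinforcement (negative R-O contingency)
    return True       # no peck: the key light is still followed by food (S-O pairing)

# Even if the bird pecks on most trials, each no-peck trial still pairs the
# key light with food, so the light can remain a Pavlovian signal for food
# while pecking itself is never reinforced.
print(omission_trial(True), omission_trial(False))  # False True
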
Negative automaintenance is really very similar to the misbehavior
noted by Breland and Breland (1961). In either case, a signal for food elicited
food behavior that, in the long run, resulted in the omission of reward. The
results generally tell us that Pavlovian contingencies embedded in experi-
ments can produce surprisingly strong control over behavior. Autoshaping
and negative automaintenance also began to establish the phenomenon of
sign tracking—that is, animals tend to approach CSs for positive events
(see Chapter 2). In fact, according to Jenkins and Moore (1973), CSs also
do something more specific. When pigeons receive pairings of one colored
key light with food and another color key light with water, the birds use an
eating movement to peck the food key and a drinking movement to peck
the water key—as if they are eating and drinking the CSs. Moore (1973; see
also Bindra, 1972) noted that Pavlovian contingencies may subtly control
the behavior observed in almost any operant situation. When a laboratory
operant is shaped by successive approximations, for example, the animal
is first reinforced in one location (typically near the food cup) and is then
required to move away from that site and more toward the lever or key.
At each step, the animal may notice new stimuli, and these new CSs may
now become associated with the reinforcer and evoke consummatory or
approach responses. Even operant response chains (see Chapter 7) can
follow from Pavlovian rules. Although the outcome of one response in a
chain is thought to reinforce the previous response and set the occasion for
the next, it is conceivable that each behavior brings the subject into contact
with a new stimulus that evokes the next behavior. Although the case is
not necessarily proven, sign tracking can go some distance in explaining
behavior in complex situations.
What does this mean for human behavior? Arthur Tomie (e.g., 1996,
2001) noted that many addictive disorders involve the handling of small
implements that can become CSs directly associated with reward. The opi-
ate user handles a needle and syringe, a crack smoker handles a pipe, and
an alcoholic might handle beer or whiskey bottles that are associated with
alcohol over many trials. When an activity involves handling a salient
CS that is associated with reward, the kinds of Pavlovian sign-tracking
processes under discussion might combine with purer operant learning to
make behavior especially strong and seemingly compulsive. (We consid-
ered such processes in Chapter 9.) There is something compulsive-like in
the behavior of the pigeon in the negative automaintenance experiment
that keeps pecking as if it cannot be helped. The more general point,
though, is that S-O contingencies and R-O contingencies almost always
go hand-in-hand in any learning situation. If we want to understand a
particular behavior—even a nominally voluntary or operant behavior—
we will do well to consider both the operant and embedded Pavlovian
contingencies.
Punishment
We are thus suggesting that either S-O or R-O learning can often lead to
the same behavioral outcome. Now, let us consider their role in punish-
ment, where organisms stop performing behaviors that have unpleasant
consequences. Punishment is not uncommon in life. For example, to stop a
child from stealing cookies from a cookie jar in the kitchen, a parent might
punish the behavior with a mild scolding (“Stop it!”). Notice, though, that
punishment might work (the child might stop stealing cookies) for at least
two reasons. First, the child might quit because he or she learns that the
response (stealing cookies) is connected with the punisher (R-O learning).
Alternatively, the child might learn that certain stimuli in the environment
predict that punishment can occur (S-O), and he or she might merely stay
away from those cues. For example, the child might learn that the cookie
jar is associated with punishment or that the combination of cookie jar and
parent in the kitchen is associated with punishment. Obviously, such learn-
ing might not prevent the theft of cookies if the parent leaves the house
for a while. A similar scenario is present when a state government tries
to keep drivers from speeding on the freeway. The police might give out
more speeding tickets—but will drivers slow down, or will they merely
learn to spot police cars in the distance (and perhaps buy radar detectors)
to avoid detection? Interestingly, recent research with rats suggests that
the effects of punishment can be specific to the context in which they are
learned (Bouton & Schepers, 2015). That is, punished behavior returns or
is “renewed” when it is tested outside the context in which it is punished,
much as extinguished behavior is renewed when it is tested outside the
context in which it is extinguished, as we saw in Chapter 5 (see Bouton,
Todd, Vurbic, & Winterbauer, 2011; Nakajima, Tanaka, Urushihara, & Ma-
saki, 2000).
The fact that punishment contingencies can support either stimulus learn-
ing or response learning is a further reason that punishment is not always
effective at suppressing behavior. Exactly this point was made in some ex-
periments by Bolles, Holz, Dunn, and Hill (1980). In one experiment, a special
lever with handlebars was installed in a Skinner box. The rats could either
push the lever down or pull it out from the wall; both of these responses
were initially reinforced on variable interval schedules of reinforcement.
After the rats were performing each response at an equal rate, Bolles et al.
(1980) began punishing one of the responses (e.g., the action of pushing the
lever down) by presenting a mild electric footshock every tenth time that
pushing occurred. If the rat was associating the response with the shock,
pushing (but not pulling) should have specifically declined. But notice that,
when shocks were presented, the rat also must have been touching the lever.
Does this allow the lever to be associated with shock as well? If the rat learns
to associate the lever with shock, it might merely stay away from the lever,
and both pushing and pulling might decline.
The results of the two sessions in which one of the responses was pun-
ished are presented in Figure 10.14. (In actuality, half the rats were pun-
ished for pushing, and half were punished for pulling.) Notice that for the
first 20 minutes or so, both behaviors declined together. This might have
been the result of stimulus learning; the rats were either staying away from
the lever generally, or perhaps they had associated the box with shock.
Eventually, though, the unpunished behavior returned to its original
strength, whereas the punished behavior stayed low. The final difference
between the two responses suggests that the rats ultimately learned the
R-O contingency. They learned to discriminate between two responses (Rs)
that were directed at the same manipulandum (S).

Figure 10.14  The rate of two behaviors (pushing and pulling) directed at the same lever over two sessions while one of the behaviors was occasionally punished with a mild electric shock. Both behaviors first declined, and then the unpunished response eventually recovered. The initial decline might have occurred because the lever was being associated with the shock (S-O learning), so the rats withdrew from the lever. The eventual discrimination between responses, however, presumably resulted from a more specific association of the punished response with shock (R-O learning). (The graph plots response rate over successive 5-minute periods.) (After Bolles et al., 1980.)
Figure 10.15  The suppression of pressing and lifting each of two levers (left and right) when a left lift (as an example) was punished with a mild electric shock. All responses initially declined, perhaps because the rats associated the environment with shock (S-O learning). With continued training, the rats suppressed both responses directed toward the left lever (S-O learning), but also eventually began to discriminate pressing from lifting (R-O learning). As in Figure 10.14, S-O and R-O learning both occur during punishment, although S-O learning may happen first. (The graph plots a suppression ratio over successive 5-minute periods.) (After Bolles et al., 1980.)

Other experiments in this series used a Skinner box with two levers
(Ss) that could each be either pressed or lifted for a food reward. Again,
all responses were reinforced on VI schedules, and at the end of a train-
ing phase, the rats were performing all the desired actions about equally.
(Interestingly, the rats pressed both levers with their paws, but they spon-
taneously learned to lift the levers with their noses.) In one experiment,
Bolles et al. (1980) punished one of the four responses (e.g., left lift) by
presenting a footshock every tenth time this response occurred. The re-
sults of this experiment are shown in Figure 10.15. As in the previous
experiment, all responses initially declined, perhaps because the rats were
associating the context with shock, and this was generally suppressing
performance. As time went on, the rats stopped performing both of the
responses directed at the punished lever—and only gradually learned to
discriminate the two responses on that lever. A subsequent experiment
punished either both responses at one lever (e.g., left lift and left press)
or both of the same responses regardless of the lever (e.g., both left and
right lifts). Rats in the group that could avoid shock by avoiding a single
lever learned much better than rats that needed to stop emitting one
response directed at both levers, although this was also learned. The
results suggest that both S-O and R-O learning can contribute to punish-
ment, but the results also show that the S-O piece is quickly evident in
behavior and the R-O piece becomes evident only later. Both types of
learning can suppress behavior, but the Pavlovian piece seems easier and
dominates. You might remember an experiment by Tolman, Ritchie, and
Kalish (1946b) in Chapter 7 that had much the same message regarding
place learning and response learning (see Figure 7.3). Rats that ran in a
“plus maze” from two different start locations found it easier to go to
a consistent place than to make a consistent response on the maze. Al-
though it is possible that Tolman’s places and Bolles et al.’s levers were
unusually salient, the results suggest that stimulus learning may often
be easier than response learning.
Summary: What does it all mean?
As we saw in avoidance learning, animals can often direct natural species-
specific behaviors to cues in the experiment that are associated with sig-
nificant outcomes (Os). In many ways, this kind of learning can dominate
behavior, and in the case of misbehavior and negative automaintenance,
it can actually interfere with getting rewards that are otherwise available.
The message here, again, is that biology matters. A second message is that every
operant situation also allows stimulus learning. Moreover, because of
the rules of sign tracking (we tend to approach signals for good things
and move away from signals for bad things), stimulus learning can often
provide a fairly complete explanation of behavior in an operant situation.
The contribution of Pavlovian learning in instrumental learning situa-
tions cannot be ignored.

A Cognitive Analysis of Instrumental Action


The research summarized above forces us to acknowledge that any instru-
mental learning situation is made up of S-O and R-O components. Figure
10.16 therefore returns us to a description of behavior that was introduced
in Chapter 1. We can think of any instrumental learning situation as one
in which we perform an action (R) that leads to a biologically important
event or outcome (O) in the presence of some set of stimuli (S). This is the
framework that Learning Theory provides for understanding all behavior.
As was illustrated in Chapter 1, a very large number of behaviors can be
summarized this way. If you are trying to understand a behavior in the
real world, identifying an S, an R, and an O is a very solid place to start.
In many ways, this general perspective was one of B. F. Skinner’s most
important contributions.
What we have been doing in this book is filling in the details. In
many classic instrumental learning theories—for example, Thorndike’s

Figure 10.16  The various types of learning that can occur in any instrumental learning situation. The diagram connects S, O, and R with links labeled Pavlovian, occasion setting, habit, and instrumental/operant. The description is the same as the one introduced in Chapter 1 (see Figure 1.18) except that terms often associated with learning each of the links are also indicated.
and Hull’s—a reinforcing O was an event that mainly “stamped in” an
association between S and R. Similarly, although Skinner was not inclined
to talk about associations or unobserved processes like “stamping in,” he
assumed that the occurrence of O had a fairly direct effect of making R
more probable. Subsequent theories, like Mowrer’s two-factor theory or
the more “cognitive” style of two-process theory proposed by Rescorla
and Solomon (1967), brought in the additional role of S and its association
with O: In these views, conditioning of S is supposed to occur in parallel
with S-R learning and motivate R. But what has emerged since then is a
somewhat different view recognizing that animals associate all the items
in Figure 10.16; behavior comes about as a product of this knowledge
(e.g., Balleine, 2001, 2005; Bolles, 1972a; Colwill, 1994; Colwill & Rescorla,
1986; Dickinson, 1989, 1994; Dickinson & Balleine, 1994; Mackintosh &
Dickinson, 1979; Rescorla, 1987). The approach can be considered “cogni-
tive” because much of what is learned is not directly manifest in behavior.
The organism develops what seems to be a relatively rich representation
of the world and uses it to direct behavior. We will now see what this
perspective is all about.
Knowledge of the R-O relation
I have been arguing all along that organisms behave as if they associate R
and O. This short-hand description of operant learning may be reasonable,
but the classic theorists did not find it necessary to explain things this way.
So, it is worth asking what, exactly, the evidence is. The most important
finding supporting the idea is the reinforcer devaluation effect (e.g.,
Adams & Dickinson, 1981; Colwill & Rescorla, 1985a). The design of a
representative experiment is shown in Figure 10.17A (Colwill & Rescorla,
1985a). In the first phase, rats were shaped to perform two operant re-
sponses: pressing a lever and pulling a chain dangling from the ceiling of
the Skinner box. Each of these behaviors was reinforced by a different re-
inforcer—either a food pellet or a little taste of 32% sucrose solution. Thus,
for some of the rats, lever pressing produced a pellet, and chain pulling
produced sucrose. (Half the rats received the reverse.) In a second phase,
one of the reinforcers (e.g., the pellet) was paired with a lithium chloride in-
jection over a series of several trials. This pairing is the classic treatment that
conditions a taste aversion, and as a result of it, the rats rejected all pellets
that were offered at the end of the second phase. The reinforcer had been
“devalued” in the sense that it was now something that the rat rejected.
In the final test, the rats were returned to the Skinner box and allowed to
press the lever or pull the chain. This test was conducted in extinction so
that neither response was paired with a reinforcer again after the devalu-
ation treatment. Nonetheless, the rat’s behavior during testing reflected
that treatment. As shown in Figure 10.17B, the response whose reinforcer
had been devalued was suppressed; the rats preferred to perform the other
behavior. The result is rather subtle, but it has two clear implications. First,
(A) Design. Phase 1: R1 — O1, R2 — O2. Phase 2: O1 — Illness. Test: R1? R2?
Figure 10.17  The reinforcer devaluation effect. (A) Experimental design. In the first phase, one response (R) was reinforced with one outcome (O), and another response was reinforced with another outcome. In a second phase, one of the outcomes was now paired with illness, which resulted in a conditioned taste aversion to it. When the rats were then allowed to perform either response in extinction, they stayed away from the response that produced the reinforcer that had since been paired with illness. (B) The test results indicate that the rat remembered what action led to what outcome and chose to make the response that would lead to the outcome it currently valued. (After Colwill & Rescorla, 1985a.)

the rats must have learned which behavior led to which reinforcer—that
is, they specifically stopped performing the response that delivered the
pellet. Second, whether the rats made a response was determined by how
much they valued its reinforcer—because they no longer liked the pellet,
they stopped performing the behavior that led to it. The classic view that
reinforcers merely stamp in an S-R association or increase the probability
of R has no way of explaining this result. Instead, the rat learned which
action led to which reinforcer and then engaged in that action depending
on how much it “liked” its reinforcer during the test.
There is another way to devalue a reinforcer without conditioning
an aversion to it. Just before the test, the animal can be allowed to feed
freely on the reinforcer. Doing so temporarily satiates the animal on that
foodstuff—that is, it causes the animal to reject any more of it, as if it is
tired of the substance or flavor, a phenomenon known as sensory-specific
satiety (e.g., Hetherington & Rolls, 1996). Experiments using this method
again initially train the rat that one response produces one outcome and
that another response yields another. Then, one reinforcer is devalued by
allowing the rat to consume it freely before the test. If the rat is then put
into the Skinner box and tested on the two responses in extinction, the
animal will tend not to perform the behavior that leads to the reinforcer
that it has just filled up on (e.g., Balleine & Dickinson, 1998; Colwill &
Rescorla, 1985a).
Selective satiation with food reinforcers produces similar results in hu-
mans (e.g., Tricomi, Balleine, & O’Doherty, 2009; Valentin, Dickinson, &
O’Doherty, 2007). Analogous experiments with very young children are
Figure 10.18  The reinforcer devaluation effect in young children. (A) Experimental setup used in experiments by Klossek and Dickinson (2012). (B) When the reinforcer (a brief cartoon clip) for one response was devalued by habituation/satiation, the children made that response (Devalued) less frequently than another response whose previous reinforcer had not been devalued (Not Devalued). The y-axis is response rate during testing expressed as percentage of the response rate observed during the original training phase. (A, after Klossek & Dickinson, 2012; B, after Klossek et al., 2008.)

especially interesting. Klossek, Russell, and Dickinson (2008) had 2- and
3-year-olds sit in front of a computer screen (Figure 10.18A). When an
image of a butterfly appeared on the screen, the child could touch it and
drag it to either side. Dragging the image to the left was reinforced by
presentation of a brief (12-second) video clip from a television cartoon;
dragging it to the right was reinforced with a different cartoon. During a
reinforcer devaluation phase, the children watched longer, 36-second clips
of one of the cartoons repeated five times. (This got tedious and boring.)
During an extinction test, the children could then choose between dragging
the butterfly to the left or to the right again. They reliably chose the action
that had led to the nondevalued cartoon (Figure 10.18B). Younger children
(1.5–2 years old) do not always show the devaluation effect (see also Ken-
ward, Folke, Holmberg, Johansson, & Gredebäck, 2009), but a later study
indicated that they do if they are given enough initial training (Klossek &
Dickinson, 2012). Thus, even for young children, instrumental behavior is
influenced by (1) knowledge of what action leads to what outcome and (2)
how much the outcomes are currently valued.
We actually saw another type of experiment in which the reinforcer
was revalued after instrumental learning when we discussed the impor-
tance of incentive learning in Chapter 9. Organisms need to learn about
the effects different reinforcers have on their motivational states—that
is, how they assign value to the reinforcers. For example, remember that
Balleine (1992) had rats learn to press a lever for pellets while they were
completely satiated. During an extinction test, when the rats could press
the lever without pellets, Balleine made them hungry. Surprisingly, hun-
ger had no effect on the rate of lever pressing unless the rat had been
given an opportunity to eat the pellets while hungry. Doing so allowed
the rat to learn that the pellet made it feel good when it needed food.
Thus, the rat had knowledge that pellets are good when hungry, so when
it was made hungry during the test, it valued the pellet even more—and
responded more frequently accordingly. Remarkably, an analogous incen-
tive learning process seems to be involved in the devaluation experiment
shown in Figure 10.17. Over trials, when the pellets were being paired
with illness, the rats had several opportunities to learn that the taste of
the pellets was awful. That is, after the first pellet-sickness pairing, the
rats tasted the pellets and could associate them with a negative emotional
“yuck!” reaction. This is how they learned that they did not like the stuff
so much.
We know this incentive learning step is necessary because if rats do
not have a chance to taste the pellets after a single aversion condition-
ing trial, they do not show the reinforcer devaluation effect (Balleine &
Dickinson, 1991). A single flavor-illness pairing is not enough to allow
the rats to know that they do not like the pellet. Instead, the animal has
to taste the flavor again—and experience a negative reaction to it. This
point is further illustrated in a fascinating experiment summarized in
Figure 10.19A (Balleine, Garner, & Dickinson, 1995). The experiment used
a drug, ondansetron, which is an antiemetic that makes humans feel less
nauseated. Rats received a single session in which lever pressing led to
sucrose and chain pulling led to a saline solution (as usual, these rela-
tions were reversed in half the animals). They then received an immediate
injection of lithium chloride, which could create a taste aversion to both
the sucrose and the saline. In two separate sessions, the rats then received
reexposures to each reinforcer; these reexposures would ordinarily allow
the rat to associate each flavor with the “yuck!” reaction. Before reexpo-
sure to one reinforcer (e.g., saline), however, the rats were injected with
ondansetron; before reexposure to the other reinforcer (e.g., sucrose), they
were injected with a placebo. The ondansetron would reduce any nausea
that the rat might feel upon retasting the averted flavor. When the rats
were subsequently allowed to lever press or chain pull in extinction, they
mainly performed the behavior associated with the reinforcer that had
(A) Design. Phase 1: (R1 — O1, R2 — O2) — Illness. Phase 2: Ondansetron: O1, Vehicle: O2. Test: R1? R2?
Figure 10.19  Incentive learning plays a role in the reinforcer devaluation effect. (A) Two responses were first paired with two different outcomes in a single session that preceded illness. Although the illness would condition taste aversions to both outcomes, the rat could not learn that it did not like an outcome until it experienced its reaction to the outcome when it tried it again. To test this idea, rats were allowed to taste both of the outcomes in Phase 2. However, O1 was tasted after the rats were injected with ondansetron, a drug that reduces nausea. During retasting, the rat could experience the “yuck!” reaction to O2, but ondansetron suppressed the “yuck!” reaction to O1. (B) When the rats were finally tested with the two responses, they tended to stay away from the one that had produced the outcome associated with the “yuck!” reaction. (After Balleine et al., 1995.)

been reexposed under ondansetron (see Figure 10.19B). They avoided
the behavior connected with the reinforcer that was retasted under the
placebo. The rats thus assigned value to each reinforcer according to how
it made them feel during the reexposure. In another experiment, injecting
ondansetron during the final extinction test had no effect on the choice
of lever pressing or chain pulling (Balleine et al., 1995; see also Balleine
& Dickinson, 1994, for related results). Thus, the rats did not need to
feel bad when they remembered sucrose or saline during the test (see
Wassum, Ostlund, Maidment, & Balleine, 2009, for related results). As
a result of incentive learning, they apparently remembered something
simple like “saline is good” and “sucrose is bad.” During the extinction
test, the rats used that information—along with knowledge of specific
R-O relations—in deciding how to respond.
These results suggest that operant learning involves some very subtle
learning processes. There are three things to remember. First, the organ-
ism associates its operant behavior with the outcome that the operant be-
havior leads to. Second, the organism decides what to do based on how
it currently values the reinforcer associated with each action. Third, value
is assigned through incentive learning. For this process to take place, the
animal must experience the reinforcer—potentially, in the appropriate mo-
tivational state—to know how much the animal likes it in that state. All
three processes come into play in operant/instrumental learning.
Once we accept that organisms associate R and O, the next question is,
how do they actually learn the connection? The answer, at least partly, is that
the laws governing R-O learning are similar to those that govern S-O (Pav-
lovian) learning. In fact, I have encouraged you to accept this idea all along.
For example, the parallels between R-O and S-O learning were discussed
back in Chapter 2, when you were shown that they are sensitive to the same
kinds of factors (e.g., extinction, timing of O, size of O, and preparedness).
It is also worth noting that R-O learning may be sensitive to the “informa-
tional variables” that seem so important in Pavlovian learning. Chapter 3
described contingency learning, blocking, and relative validity; these effects
were the ones that led to the conditioning theories reviewed in Chapter 4.
Analogous informational effects have been demonstrated in operant learn-
ing. For example, we can reduce the contingency between R and O by merely
presenting O while the animal is not performing R. This reduces the rate of
operant responding (e.g., Colwill & Rescorla, 1986; Dickinson & Mulatero,
1989; Hammond, 1980; Williams, 1989). As we saw above, learned helpless-
ness is another parallel with contingency learning. Others have reported
overshadowing, blocking, and relative validity effects in operant learning
(e.g., Mackintosh & Dickinson, 1979; Pearce & Hall, 1978; see also Lovibond
et al., 2013). For example, St. Claire-Smith (1979) had lever-pressing rats
receive a mild electric shock a few milliseconds after an occasional lever
press. The shocks caused punishment, and lever pressing declined once the
shocks were introduced. If, however, a brief CS was paired with shock prior
to the punishment test and the CS was inserted between the response and the
punishing shock, there was less punishment—as if the CS “blocked” learning
of the R-shock association. This kind of result suggests that factors we think
are so important in S-O learning—that is, surprisingness of O, prediction
error, the extent to which it is rehearsed or processed, and so forth—may be
important in operant learning, too.
Nonetheless, there is probably something more than a simple Pavlovian
association between R and O. As we saw in Chapter 7, we can allow R to
produce O according to either a ratio schedule (O is delivered after every
nth response) or an interval schedule (O is delivered on the occasion of the first
response after some temporal interval since the last O). These schedules can
have quite different effects. For example, response rate is higher on ratio than
interval schedules when the rate of reinforcement is equated between them
(e.g., Catania, Matthews, Silverman, & Yohalem, 1977). There is a paradox
here: If response rate is higher on the ratio schedule than on the interval
schedule when reinforcement rate is otherwise the same, the percentage of
responses actually paired with O must be lower. By most rules of associative
learning, the strength of the R-O association (associative strength) should con-
sequently be lower, but how could lower associative strength lead to stronger
responding? Perhaps even more relevant to our discussion is that the rein-
forcer devaluation effect is stronger if R has been trained with a ratio schedule
rather than with an interval schedule (Dickinson, Nicholas, & Adams, 1983);
that is, a powerful aversion to the O has less influence on the interval-trained
response, as if knowledge of the R-O relation is less important.
The difference between ratio and interval schedules may be fundamen-
tal. As we saw in Chapter 7, ratio and interval schedules differ in terms of
their molar feedback functions. In the ratio schedule, response rate directly
affects reinforcement rate. If the rat makes more Rs, it always earns more
Os; if it makes fewer Rs, it earns fewer Os. Thus, over time, reward rate is
highly correlated with response rate (e.g., Baum, 1973). This is not as true in
an interval schedule: Once the animal is responding enough to receive the
maximal rate of O (say, one per minute in an FI 1-minute schedule), increas-
ing response rate has no further effect on reward rate. Reward rate is not as
strongly correlated with response rate. Perhaps knowledge of the connection
between action and outcome (R-O) depends on the strength of the action-
outcome correlation (e.g., Dickinson, 1989, 1994). Some writers (e.g., Balleine,
2005; Dickinson, 1994) have claimed that because of the stronger correlation
between R and O, ratio schedules engender true action-outcome learning,
whereas interval schedules might generate habit (discussed further below).
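The contrast between the two molar feedback functions is easy to see with a little arithmetic. In the Python sketch below, the ratio schedule delivers one O per fixed number of responses, so reinforcement rate is simply proportional to response rate; for the interval schedule, a standard simplifying approximation is that each O takes, on average, the programmed interval plus the wait for the response that collects it. The function names, parameter values, and response rates are illustrative assumptions, not values from any experiment discussed here.

def ratio_feedback(response_rate, responses_per_reinforcer=10):
    """Ratio schedule: reinforcement rate is directly proportional to response rate."""
    return response_rate / responses_per_reinforcer

def interval_feedback(response_rate, mean_interval=1.0):
    """Interval schedule (mean interval in minutes): each reinforcer takes roughly
    the programmed interval plus the expected wait for the collecting response."""
    return 1.0 / (mean_interval + 1.0 / response_rate)

for b in (5, 10, 20, 40, 80):  # responses per minute
    print(b, round(ratio_feedback(b), 2), round(interval_feedback(b), 2))
# Ratio: 0.5, 1.0, 2.0, 4.0, 8.0 reinforcers per minute (tracks response rate).
# Interval: about 0.83, 0.91, 0.95, 0.98, 0.99 (saturates near one per minute),
# so reward rate is only weakly correlated with response rate.
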
Knowledge of the S-O relation
We have already seen that it is useful to acknowledge the role of stimulus
learning whenever operant learning occurs; as emphasized in Chapter 9,
such learning is minimally thought to motivate instrumental responding.
To further push the general parallel with R-O learning, Pavlovian CSs also
signal specific Os, which is very clear in simple Pavlovian experiments. For
example, in Chapter 2 we saw that after conditioning has occurred, one
can modify the animal’s representation of the US (e.g., by habituating the
response to it), and behavior changes accordingly (e.g., Rescorla, 1973a,
1974). And there have been experiments more directly analogous to those
in the reinforcer devaluation effect. For example, Holland and Rescorla
(1975b) and Holland and Straub (1979) paired a CS with food pellets and
then conditioned an aversion to the pellets. Doing so reduced the strength
of the conditioned response to the CS when it was tested again, suggest-
ing that the rat associated the CS with a particular US (see also Colwill &
Motzkin, 1994). Peter Holland (1990b; see also 2005) has run many very
sophisticated experiments suggesting that rats encode a very rich repre-
sentation of the US during Pavlovian learning.
There is also evidence that a stimulus trained as an SD in an operant
setting conveys specific information about the reinforcer. For example, in
an experiment by Colwill and Rescorla (1988) described in Figure 10.20A
(see also Colwill & Triola, 2002), rats learned to poke their noses into a
hole in the wall of a Skinner box during both Light and Noise SDs. In the
presence of the Light, nose poking produced one reinforcer (e.g., a pellet),
whereas in the presence of the Noise, it produced another (e.g., sucrose).
Because of the training, the rats did most of their nose poking when the
Light or the Noise was turned on. The question, however, was whether the Light
and Noise were also specifically associated with pellet and sucrose, respectively.
To answer this question, lever-press and chain-pull responses were each
associated with one of the reinforcers in a second phase. The Light and
Noise were never presented at this time. In a final test, however, the Light
and Noise were presented while the rats could press the lever or pull the
chain.

Figure 10.20  Discriminative stimuli are associated with specific Os. (A) Design (Phase 1: L: RP — O1, N: RP — O2; Phase 2: R1 — O1, R2 — O2; Test: L: R1?, R2? and N: R1?, R2?). In this experiment, rats first learned to poke their noses into a hole (RP) in the presence of Light (L) and Noise (N) SDs. In the Light, the nose poke was reinforced with one outcome, and in the Noise, it was reinforced with another. (Poking was not reinforced at all in the absence of the Light or the Noise.) In the next phase, different responses were then reinforced with the two outcomes. (B) Test results (responses per minute over blocks of 2 trials). In a final test (conducted in extinction), the Light and Noise both increased the rate of the new response that had been associated with the Same outcome, but not the Different outcome, relative to Baseline. (After Colwill & Rescorla, 1988.)

As Figure 10.20B shows, the SD mainly affected the new response
that had been associated with the same reinforcer. Thus, the SD was indeed
associated with a particular reinforcer. “Transfer tests” like this one suggest
that both R and S are associated with specific Os.
In Chapter 9, we discussed the effects of similar transfer tests, called
Pavlovian-instrumental transfer tests, in which Pavlovian CSs rather than
operant SDs have been presented while the animal is performing an op-
erant response. We saw that the effects of presenting Pavlovian CSs are
actually multiple and complex. First, like the SD in Figure 10.20, a simple
CS can also selectively enhance operant responses associated with the
same reinforcer (i.e., “outcome-specific transfer”; e.g., Corbit et al., 2007;
Delamater, 1996; see Figure 9.16). In this way, a Pavlovian CS can guide
the animal’s choice of which of several responses to perform. Perhaps, as
noted earlier, one effect of a CS is to evoke a representation of the sensory
aspects of a particular reinforcer. But remember that there is a second,
more general effect: A CS can also have a more general motivating or
arousing effect on responses that are not necessarily associated with the
same Outcome (i.e., “general transfer”; e.g., Balleine, 1994; Corbit & Bal-
leine, 2005, 2011; Corbit et al., 2007). This effect is more like the process
envisioned by two-process theory (Rescorla & Solomon, 1967; see also
Konorski, 1967). A third possible effect of a CS, of course, is one that was
emphasized in earlier parts of this chapter: CSs associated with Os can
also evoke natural behaviors of their own. S-O learning thus has several
effects on behavior in instrumental situations: It can evoke a representa-
tion of a specific O that strengthens behaviors associated with the same
O, it can cause a general arousing effect, and it can evoke CRs (e.g., fear
or sign tracking or goal tracking) on its own.
The learning process behind S-O learning was thoroughly examined in
Chapters 3 through 6. A complete review of all that material is, of course,
beyond the scope of this chapter, but given the role of S-O learning in
operant behavior, it is worth at least noting the relevance of this material.
Thus, for example, we know that the learning process tends to find the most
valid or reliable predictor of O, as is suggested in effects like blocking and
relative validity. We also know that attention and memory processes may
play a role. It should also be added that we specifically discussed how the
stimulus guides operant behavior in Chapter 8, where we encountered
additional stimulus-learning processes (such as perceptual learning and
mediated generalization) and explicitly considered the effects of complex
stimuli like categories, space, and time. Instrumental behavior occurs in
the presence of many kinds of stimuli, and a synthetic view of instrumental
action must acknowledge them all. Interestingly, one theme in the Chapter
8 discussion was that the learning and memory processes involved when
complex stimuli guide instrumental action may not be that different from
those in “simpler” forms of Pavlovian learning.
Another issue comes up as we consider interactions between S-O and
R-O learning. In the operant situation, an R must always occur for the O
to appear—that is the very definition of an instrumental action. Because of
this requirement, if we are to consider informational variables, we must ac-
knowledge that the R is actually more informative than the S about the next
presentation of O. Shouldn’t the R therefore compete with the S for association
with O? The answer is yes. Ultimately, we also need to acknowledge another
important point: The SD is more than just a Pavlovian CS. It also provides
information about the R-O relationship. That is, the SD truly sets the occasion
for the R-O relationship, another relationship represented in Figure 10.16.
S-(R-O) learning (occasion setting)
Skinner (1938), of course, had always claimed that the stimulus in operant
conditioning set the occasion for the response-reinforcer relationship. As
we have previously seen in this chapter, however, often this “three-termed”
role reduces to simple S-O learning; the key peck is often merely elicited
by excitation conditioned to the key light CS. On the other hand, there is
good evidence of true occasion setting, and some of it even comes from re-
search using the pecking response. In a very thoughtful experiment, Jenkins
(1977), of the Brown and Jenkins autoshaping experiment, set things up as
illustrated in Figure 10.21. Reinforcers were scheduled by a computer so
that they were equally probable whether a key light was on or off. Because
the key light therefore predicted no change in the probability of the US, it
could not be an excitor or an inhibitor.

Figure 10.21  If reinforcers are presented with the same probability in the presence and absence of a CS, the CS will not be treated as a signal for O (e.g., Chapter 3). If, however, the CS nonetheless signals that a response is now required to earn the reinforcer that is otherwise presented for free, the animal will learn to respond during the CS. Because the CS cannot be a Pavlovian excitor, responding cannot result from simple S-O learning—it must result from true S-(R-O) learning in which the animal learns that R is associated with O during S.

But Jenkins also arranged a clever
twist: When the key light was off, reinforcers were presented regardless of
what the pigeon was doing, but when the key light was on, the bird had
to peck the key to produce them. Thus, although the key light could not
have been a simple Pavlovian CS, it did convey some important informa-
tion: It said, “Pigeon, you must now peck to get those reinforcers that you
otherwise get for free.” Under these conditions, pigeons did learn to peck
when the key light was turned on (and not when it was off). Not only was
the pecking response controlled by its consequences (and therefore at least
partly an operant after all!), but the key light served as a stimulus that truly
signaled the relationship between R and O.
There is other good evidence that an SD can control behavior in a way
that does not reduce to its simple association with O. Consider the experi-
ment sketched in Figure 10.22A (Colwill & Rescorla, 1990b).

Figure 10.22  A clear demonstration of S-(R-O) learning. (A) Design (Phase 1: L: R1 — O1, R2 — O2 and N: R1 — O2, R2 — O1; Phase 2: O2 — Illness; Test: L: R1?, R2? and N: R1?, R2?). In Phase 1, different responses were paired with different outcomes, but the specific outcome depended on whether a Light (L) or a Noise (N) SD was present. In the second phase, a taste aversion was conditioned with O2. (B) Test results (responses per minute for R1 and R2 in the Light and in the Noise). During the final extinction tests, the rats stayed away from the specific response that led to the devalued outcome in each of the SDs. To perform this way, the rats must have learned that R2 led to O2 in the Light and that R1 led to O2 in the Noise, in an example of S-(R-O) learning. (After Colwill & Rescorla, 1990b; data from Rescorla, 1991.)

Rats were put through an experiment involving two behaviors (lever pressing and
chain pulling—what else?), two Os (pellets and sucrose), and two SDs (Light
and Noise). In the first phase, when the Light came on, R1 was associated
with pellets, and R2 was associated with sucrose. But when the Noise came
on, the reverse was true: R1 was now associated with sucrose, and R2 was
now associated with pellets. Both Noise and Light were equally associ-
ated with both Os and both Rs. In the second phase, one of the reinforcers
was now paired several times with lithium chloride to create reinforcer
devaluation. In a final test conducted in extinction, the rats were allowed
to press the lever and pull the chain when the Light and the Noise were
presented. As Figure 10.22B indicates, the rats demonstrated that they
knew that the devalued reinforcer followed R2 when the Light was on, but
that it followed R1 when the Noise was on. Thus, the Light and the Noise
signaled specific relationships between the Rs and the Os. Similar results
were reported by Trask and Bouton (2014); in their case, different contexts
(instead of a Light and a Noise) signaled the different R-O relationships.
The results of these experiments require that we accept that the rats had a
fairly sophisticated knowledge of the various contingencies. There can be
no question that there is more going on in operant learning than O merely
stamping in a response.
You may recognize that this occasion-setting effect is directly analo-
gous to one we considered in Pavlovian learning (see Chapter 5). In the
Pavlovian case, a stimulus signals a particular S-O relationship: S-(S-O).
The parallel is further reinforced because a Pavlovian occasion setter
can apparently substitute for an operant SD (e.g., Davidson, Aparicio, &
Rescorla, 1988). As in other forms of learning, S-(R-O) learning requires
that S be informative rather than redundant. In this case, S must be in-
formative about the occurrence of a particular R-O relationship (e.g.,
Rescorla, 1991).
S-R and “habit” learning
It is interesting to note that despite all the evidence of R-O, S-O, and S-(R-
O) knowledge, there is still a place in the synthetic view for good old S-R
learning. The basic argument (e.g., Colwill, 1994; Colwill & Rescorla, 1985a;
Dickinson, 1989) is that even after a reinforcer devaluation manipulation is
thorough and complete so that the rat completely rejects the O when it is
offered, the animal will still go over and perform the action a little during
the extinction test. If you look closely at the level at which the rat performs
the response associated with the devalued reinforcer in Figure 10.17, for
example, you will see that it is still performing that action a little bit—the
rat is simply doing it out of habit. If put into the box, the rat will run over
to the lever and press it a few times without much regard to the action’s
actual consequences.
There is a fair amount of interest in the idea that the S-R association
begins to take over as an action is performed repeatedly. The idea seems
consistent with our everyday intuitions about the difference between true
goal-oriented behavior and simple habits. Although I presumably used
to make coffee when I got out of bed because I liked its effects, I have by
now made coffee so many times that I can practically do it in my sleep,
and certainly with little thought about the process. In fact, a surprising
amount of everyday human behavior occurs repetitively and automatically,
without much awareness (e.g., Bargh & Chartrand, 1999; Wood & Rünger,
2016). From a functional perspective, the conversion of actions into habits
makes sense; our working memories are thus liberated to devote space to
other mental problems. According to the philosopher Alfred North White-
head (1911), “Civilization advances by extending the number of operations
which we can perform without thinking about them.”
In fact, there is evidence consistent with the idea that habit takes over
with repeated practice. For example, an operant behavior that is given
extensive training can be quite insensitive to the reinforcer devaluation
effect. Holland (2004) gave different groups of rats 2, 5, or 20 sessions of
instrumental training in which lever pressing produced food pellets. Dif-
ferent groups then had the pellets paired (or not paired) with injection of
lithium chloride. Although the pairings appeared to produce equivalent
aversions to the food pellets in the paired groups, when the rats were
returned to the Skinner box and tested for lever pressing in extinction,
only the groups given 2 or 5 sessions of instrumental training suppressed
their lever pressing (Figure 10.23). Thus, the rats behaved as if a highly
practiced operant was less dependent on their knowledge of the R-O as-
sociation (see also Adams, 1982; Adams & Dickinson, 1981; Corbit, Nie, &
Janak, 2012; Dickinson, Balleine, Watt, Gonzalez, & Boakes, 1995; Killcross
& Coutureau, 2003; Thrailkill & Bouton, 2015).

Figure 10.23  The reinforcer devaluation effect weakens with extensive instrumental training. Rats received 2, 5, or 20 sessions of operant lever-press training, and then the pellet reinforcer was paired with illness in some rats (Devalued) or presented separate from illness in others (Not devalued). In the extinction tests shown (responses per minute as a function of the number of instrumental training sessions), reinforcer devaluation suppressed lever-press responding in the groups that had received only 2 or 5 sessions of instrumental training, but not 20. With extensive training, instrumental responding may depend less on knowledge of the R-O association and more on the S-R habit. (After Holland, 2004.)
Comparable results have been shown in humans. Tricomi et al. (2009)
had adults press one button to get an M&M candy and another button to get
a tasty Frito corn chip, both on VI 10-second schedules. (The participants
actually received a computer image of each type of food; the correspond-
ing foods were made available later.) One group had two brief sessions of
training, whereas a second group had six times as much. The participants
were then given a bowl full of either M&Ms or Fritos and told to eat them
until the food was no longer pleasant. During extinction tests, participants
in the less trained group performed the action associated with the de-
valued outcome less than the other response—they showed a reinforcer
devaluation effect. In contrast, the overtrained group showed no differ-
ence. Extended, repeated training makes your instrumental behavior less
dependent on your knowledge of the R-O relationship and your current
interest in obtaining O. Extended training makes behavior less sensitive
to its consequences.
One explanation of all this is that, with extensive practice, the animal
or person’s rate of behavior becomes relatively constant over time. Thus,
he or she no longer experiences variation in both response rate and reward
rate—which would be necessary to perceive their correlation and thus
the relation between R and O (e.g., Dickinson, 1989). It is worth noting,
though, that extensive training does not always make an action immune to
reinforcer devaluation (Colwill & Rescorla, 1985b, 1988; Colwill & Triola,
2002). That is, under some conditions, operant lever pressing and chain
pulling are still determined by the rat’s knowledge of what leads to what,
even after extended training. Those conditions appear to be ones in which
rats are trained with multiple operants and multiple reinforcers (see Hol-
land, 2004; Kosaki & Dickinson, 2010). Exposure to more than one R-O
relationship in the same situation may encourage the animal to maintain
attention to its different behaviors, or, because it requires not performing
one R (and not earning its O) while engaging in the other R, it maintains
variation in response rate and reward rate—and thus a correlation between
each action and outcome over time (e.g., Dickinson, 1989). Perhaps repeti-
tion eventually turns your actions into habits unless there is some reason
to pay attention to what you are doing.
So, how are habits learned? One possibility is to consider a set of mod-
els that computer scientists have created known as reinforcement learning
models (e.g., Sutton & Barto, 1998). As a group, these models attempt to
capture the basics of instrumental learning by assuming that in a given
situation the organism will make a response depending on how strongly
it is associated with a reinforcer. No representation of the reinforcer is as-
sumed. Instead, the association is updated over time with a mathematical
equation that is closely related to the Rescorla-Wagner model (see Chapter
4). Reinforcement learning models that work this way can be seen as models
of habit learning. However, one problem with early models was that they
could not explain the reinforcer devaluation effect or predict when the ef-
fect would work or not. (Remember that habits, which are not sensitive to
reinforcer devaluation, develop with lots of training and are prevented by
training with multiple responses and reinforcers.) Some models in this tra-
dition (e.g., Daw, Niv, & Dayan, 2005) have proposed two different learning
processes, one that supports habit and another that supports goal-directed
action learning. This model can handle the conversion of actions to habits
with extended training and the prevention of habit learning by training
with multiple responses and reinforcers. Many gaps still remain, though
(e.g., Dezfouli & Balleine, 2012; see also Gershman, Blei, & Niv, 2010; Redish,
Jensen, Johnson, & Kurth-Nelson, 2007). The development of reinforce-
ment learning models—and our understanding of habit learning—is still
a work in progress.
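To make the idea of a cached, habit-like value more concrete, here is a minimal sketch of such an update rule. It is an illustration under simplifying assumptions, not any particular published model: each response's strength is adjusted by a prediction error, in the spirit of the Rescorla-Wagner rule, and nothing about the identity of the reinforcer is stored, so a later change in the reinforcer's value cannot reach back and alter the cached strengths.

import random

# A sketch of a cached-value ("habit") reinforcement learning update.
# All names and numbers here are illustrative assumptions.

alpha = 0.1                                          # learning-rate parameter
values = {"lever press": 0.0, "chain pull": 0.0}     # cached strength of each response

def choose_response():
    # Perform the response with the greater cached value (ties broken randomly).
    best = max(values.values())
    return random.choice([r for r, v in values.items() if v == best])

def update(response, reward):
    # Error-correction learning: strength changes in proportion to the
    # difference between the reward obtained and the reward expected.
    prediction_error = reward - values[response]
    values[response] += alpha * prediction_error

# Example: reinforce lever pressing repeatedly; its cached value climbs toward 1.0.
for _ in range(100):
    response = choose_response()
    update(response, reward=1.0 if response == "lever press" else 0.0)
print(values)

Because the values are stored without any representation of the O itself, a model like this is insensitive to reinforcer devaluation; capturing goal-directed action, as noted above, requires adding a second process.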
The distinction between S-R habits and more cognitively mediated
behavior is not really new, however. You may remember that Edward
Tolman was interested in the distinction as far back as the 1940s. He and
his students ran many ingenious experiments in mazes suggesting that
rats may be more ready to learn about places than simple S-R responses
(see Chapter 7). Some relatively modern research returned to some of
Tolman’s methods and made some interesting discoveries. For example,
Packard and McGaugh (1996) ran rats in the simple plus maze illustrated
in Figure 10.24A. The rats were first rewarded for going from a consis-
tent start place (the south) to a consistent goal location (the west) for four
trials each day over a period of several days. During a test trial on Day
8, they were started from a new location—namely, the north. Most of the
rats went west—that is, they returned to the rewarded place rather than
making the rewarded response (a left turn; see Figure 10.24B, top left).
They then received additional training for several more days—that is,
they were rewarded again for going west from the start location in the
south. When the rats were finally tested from the north again on Day 16,
they now turned left rather than heading west (see Figure 10.24B, top
right). Extended training thus converted the rat’s behavior from a goal-
or place-directed activity into a kind of S-R habit (see also Hicks, 1964;
Ritchie, Aeschliman, & Pierce, 1950).
Packard and McGaugh’s (1996) study was especially interesting because
they also temporarily inactivated different areas in the rats’ brains with
lidocaine during testing. One area they inactivated was the hippocampus,
which we have already seen is important in spatial learning (see Chapter
8). As shown in the middle of Figure 10.24B, during the first test, on Day
8, when control animals were choosing the correct place, inactivation of
the hippocampus abolished the place preference. However, after extended
training, when control rats had shifted to an S-R habit, inactivating the
hippocampus had no effect (Day 16). Packard and McGaugh also inacti-
vated another brain area—the caudate nucleus—in another group of rats.
As shown at the bottom in Figure 10.24B, inactivating the caudate after
initial training (when place-learning was evident) produced no change
in behavior (Day 8). But after extended training (when S-R learning was
evident in controls), inactivation of the caudate had a big effect (Day 16). At
this point, caudate inactivation abolished the S-R response preference, and
the rats chose the correct place again! As habit takes over with extended
training, it appears to engage the caudate, but it does not abolish place
knowledge. Other evidence suggests that place and response learning can
also go hand-in-hand, although they may compete with each other to some
extent (Gibson & Shettleworth, 2005).

Figure 10.24  More evidence that extensive training may turn cognitively mediated behavior into habit. (A) A plus maze. Rats were started from the south and rewarded for going to the western goal location. During test trials, they were started from the north. If they went west, they returned to the rewarded Place; if they went east, they performed the rewarded Response (turning left). (B) Number of animals that went to the rewarded Place or performed the rewarded Response during test trials conducted after 8 days or 16 days of training. Top: Control rats went to the correct Place after 8 days of training, but chose the rewarded Response after 16 days—as if the behavior switched from place learning to an S-R habit. Middle: Rats whose hippocampus was inactivated during the test did not show place learning at Day 8, but showed normal S-R learning at Day 16. Bottom: Rats whose caudate nucleus was inactivated during the tests showed normal place learning at Day 8 and continued to show place responding at Day 16. Apparently, the caudate nucleus is necessary for an S-R habit to be expressed, and when habit takes over, it does not destroy knowledge about the rewarded place. (After Packard & McGaugh, 1996.)
There is now quite a bit of research that seems to separate place and
habit learning as well as the brain areas that underlie them (for reviews,
see Packard, 2001; White & McDonald, 2002). This work also complements
research correlating different brain areas with other forms of actions and
habits assessed by reinforcer devaluation methods (e.g., Balleine, 2005; Bal-
leine & O’Doherty, 2010). Research in behavioral neuroscience is consistent
with the purely behavioral research reviewed in this chapter in support-
ing the roles of S-O, R-O, S-(R-O), S-R, and incentive learning processes in
instrumental learning.

Go to the Companion Website at sites.sinauer.com/bouton2e for review resources and online quizzes.

Summary
1. Two-factor theory addressed the question, what reinforces an avoidance
response? In modern terms, the theory proposes that (a) organisms
associate stimuli in the environment with an aversive O, which allows
those stimuli to evoke fear; and (b) the avoidance response is reinforced
when it eliminates or escapes those warning stimuli and therefore
causes fear reduction. Two-factor theory emphasizes the interaction
between stimulus learning (Pavlovian fear-conditioning) and response
learning (operant/instrumental reinforcement through fear reduction).
2. Two-factor theory was challenged by the fact that avoidance learning
can occur if the response simply reduces the rate of aversive stimu-
lation, without any explicit warning stimuli, and by the fact that the
strength of avoidance behavior is not correlated with overt levels of
fear. These challenges were addressed by noting that temporal cues
that predict O can become conditioned fear stimuli and that “fear” is
best defined as a central state or expectancy rather than a peripheral
response (see Chapter 9).
3. Two-factor theory’s emphasis on reinforcement by fear reduction ran
into difficulty when it was discovered that escaping the warning signal
is not important when the animal can avoid by performing a natural
behavior that has presumably evolved to avoid predation—a so-called
species-specific defense reaction (SSDR).
4. SSDR theory emphasized the organism’s evolutionary history. Avoidance
learning was thought to occur rapidly if the required response resem-
bled a natural defensive behavior. If not, learning depends more on
feedback (or perhaps reinforcement provided by the inhibition of fear,
which is provided by feedback cues).
5. The field’s approach to avoidance behavior has become more etho-
logical (it considers the function of natural defensive behavior), more
Pavlovian (natural SSDRs appear to be respondents guided by learning
about environmental cues rather than operants reinforced by their con-
sequences), and more cognitive in the sense that what is learned now
appears to be separate from what is shown in behavior.
6. Exposure to uncontrollable aversive events can interfere with subse-
quent escape or avoidance learning (the “learned helplessness effect”).
Although exposure to uncontrollable aversive events has many effects,
one is that organisms may learn that their behavior is independent of
O. If an aversive O is uncontrollable, it also has especially pernicious ef-
fects (e.g., it can cause more fear conditioning and may lead to stomach
ulceration).
7. In appetitive learning (in which organisms learn to respond to earn posi-
tive Os like food), different behaviors are also learned at unequal rates,
and natural behaviors that are elicited by Ss that predict Os can intrude.
Pavlovian learning is always occurring in operant learning situations, and
its impact is difficult to ignore.
8. S-O and R-O learning often work in concert. For example, when an
operant behavior is punished, the organism may stop responding either
because it associates the response with the aversive consequence (R-O)
or because it associates nearby stimuli with that consequence (S-O) and
withdraws from them (negative sign tracking). Conversely, in reward
learning, the organism may respond either because it associates the re-
sponse with reward or because it associates nearby stimuli with reward
and approaches them (positive sign tracking).
9. Animals learn several things in instrumental/operant situations: They as-
sociate their behavior with its consequences (R-O), they associate stimuli
in the environment with those consequences (S-O), they may learn that
stimuli in the environment signal the current relationship between the
behavior and its consequences (occasion setting or S-[R-O]), and they
may learn a simple association between the environmental cues and the
response (S-R).
10. The reinforcer devaluation effect provides clear evidence of R-O learn-
ing: If a rat is taught to press a lever for sucrose and then sucrose is sep-
arately associated with illness, it will press the lever less—even though
the response has never been paired with sucrose after its association
with illness. Thus, the rat learns that lever pressing leads to sucrose and
then responds according to how much sucrose is liked or valued. Incen-
tive learning plays a crucial role in the assignment of value.
11. Organisms can also learn what specific O follows an S. S-O learning
influences instrumental action by motivating the response (see Chap-
ter 9) and by allowing S to evoke behavior directly (including positive
or negative sign tracking). The laws of S-O learning were discussed in
Chapters 3 through 5. Learning about more “complex” stimuli, such as
polymorphous categories and temporal and spatial cues, was reviewed
in Chapter 8; it appears to follow similar rules.
12. S can also “set the occasion” for an R-O relationship (S-[R-O] learning)
in a way that is distinct from simple S-O learning. For example, organ-
isms can learn that S signals that the animal must now perform R to
obtain an O that is otherwise freely available.
13. Organisms may also learn to perform an R reflexively, without regard to
its consequences—out of habit (S-R learning). S-R learning may become
especially important after many repetitions of the instrumental action
and may function to keep working memory free and available for other
activities.

Discussion Questions
1. What are the two factors in Mowrer’s two-factor theory of avoidance
learning? What is the evidence for and against this theory? According
to the theory, what reinforces avoidance behavior? What would you say
is the most modern view of what provides the reinforcer in avoidance
learning?
2. Why is it so hard to punish an SSDR like freezing in a rat? Why is it
so hard to weaken the pigeon’s pecking at a key that signals food by
making each peck prevent the delivery of food (“negative automainte-
nance”)? What are the implications of these findings for understanding
what can be learned, and control behavior, in instrumental learning
situations?
3. Describe learned helplessness and the immunization effect. Discuss
what you think are the most important implications of these phenom-
ena for understanding the development of stress and anxiety, as well as
resilience, in humans.
4. How have experimenters separated the contributions of stimulus learn-
ing (S-O) and response learning (R-O) in punishment?
5. Describe the reinforcer devaluation effect. What does it tell us that
organisms must learn in instrumental (operant) learning situations, and
why might they choose to perform one instrumental behavior over
another? How is this perspective different from the older view that rein-
forcers merely strengthen (stamp in) instrumental behavior?
6. What is the evidence for S-(R-O) learning?
7. What is a habit? How common do you think habits are in your life? How
is a habit different from a goal-directed action? How do experimenters
distinguish between the two? What makes an action turn into a habit?
8. Use what you know about S-O learning, R-O learning, S-(R-O) learn-
ing, and habit learning to come up with a general explanation of why
people overeat food or become addicted to drugs.
Key Terms
adjunctive behaviors  439
immunization effect  432
interim behaviors  439
learned helplessness effect  432
learned helplessness hypothesis  432
learned irrelevance  432
negative automaintenance  441
recuperative behavior  425
reinforcer devaluation effect  446
schedule-induced polydipsia  439
species-specific defense reactions (SSDRs)  420
terminal behavior  438
two-factor theory  413
Glossary
A Analogous  Two or more traits that are similar
A1  In SOP theory, the maximal state to which in function but not in structure or evolutionary
elements in a memory node can be activated origin.
when the corresponding conditional stimulus or Animal cognition  A subfield of learning theory
unconditional stimulus is presented. that examines the cognitive (mental) processes
A2  In SOP theory, a secondary, or lower state and abilities of animals, often by using stimulus
of activation to which elements decay after they control techniques. Sometimes involves com-
have been in A1. A retrieval cue also activates parisons across species.
elements in an associated node to the level of A2. Antecedent  An event that precedes another
Acquired equivalence  See mediated one. Respondent behaviors are responses to
generalization. antecedent events.
Acquired motivation  Motivation that origi- a-process  In opponent-process theory, the pro-
nates from experience with reinforcers or cess underlying the initial emotional response
punishers in instrumental learning tasks. For to a stimulus. Compare to b-process.
example, see incentive motivation. Artificial selection  When humans intervene
Acquisition  The phase in a learning experiment in animal or plant reproduction to ensure that
in which the subject is first learning a behavior desirable traits are represented in successive
or contingency. generations. Individuals with less desirable
Adjunctive behaviors  Stereotyped behaviors, traits are not allowed to reproduce.
such as excessive drinking (schedule-induced Association  A connection or relation between
polydipsia), which may emerge when animals two things, such as sense impressions, ideas,
receive positive reinforcers at regular intervals. stimuli, or stimuli and responses.
Affect  Emotion. Atomistic  Consisting or made up of many
After-image  The visual image seen after a separate elements. The British Empiricists were
stimulus is removed; typically, it is an opposite said to have an atomistic view of the mind
color than the stimulus. because they believed that complex thoughts
resulted from the accumulation of many differ-
After-reaction  The reaction after a stimulus is ent associations.
removed; according to opponent-process theory,
it is typically the opposite of the initial reaction Attentional priming  The finding that recent
to the stimulus. exposures to a stimulus or to cues associated
with that stimulus can decrease the time it takes
Agoraphobia  An abnormal fear and avoidance to find the stimulus when it is presented among
of open or public places that often accompanies distractors.
panic disorder.
466  GLOSSARY

Autoshaping  A form of sign tracking in which a porates economic principles in understanding


keylight that is paired with food elicits pecking operant behavior.
in the pigeon. It has become a popular method Behavioral regulation theory  The view that
for studying classical conditioning. an organism will work to maintain a preferred
Avoidance  An instrumental learning situa- distribution of behavior. See response deprivation
tion in which performing an action or response hypothesis; bliss point.
prevents a noxious or aversive stimulus from Behavioral theory of timing  A theory of
occurring. Involves negative reinforcement. interval timing that proposes that animals use
B changes in their own behaviors to measure the
B. F. Skinner  (1904–1990) Influential 20th- passage of time.
century American psychologist who first Bidirectional response system  An experi-
promoted radical behaviorism and pioneered the mental setup where it is possible to measure
operant experiment and the study of operant both excitation and inhibition because response
conditioning. levels can go either above or below a starting
Backward blocking  The finding (primarily in baseline.
humans) that little or no conditioning occurs to Bliss point  An organism’s preferred distribu-
a conditional stimulus if it is combined, during tion of behavior.
conditioning trials, with another conditional Blocking  In classical conditioning, the finding
stimulus that is later paired with the uncon- that little or no conditioning occurs to a new
ditional stimulus. Backward blocking differs stimulus if it is combined with a previously
from ordinary blocking (i.e., “forward blocking”) conditioned stimulus during conditioning trials.
in that conditioning with the other stimulus Suggests that information or surprise value is
occurs after (rather than before) the compound important in conditioning.
conditioning. b-process  In opponent-process theory, the
Backward conditioning  A classical condi- process underlying an emotional response that
tioning procedure in which the conditioned is opposite the one controlled by the a-process.
stimulus is presented after the unconditioned The b-process functions to compensate for the
stimulus has occurred. Can lead to either no a-process, and starts and then decays relatively
conditioning, conditioned excitation, or condi- slowly. Compare to a-process.
tioned inhibition depending on the timing of British Empiricists (also British Association-
the two stimuli. ists)  British philosophers (including John Locke
Beacon  A cue that is close to a goal that can be and David Hume) who proposed that the mind
detected from a distance and approached. is built up from a person’s experiences.
Behavior chain  A sequence of behaviors that C
is theoretically put together with the help of Categorization  Arranging items into classes or
discriminative stimuli that reinforce the preced- categories. See category learning.
ing behavior and set the occasion for the next
behavior. Category learning  Learning to identify specific
items as members, or not, of a larger group or
Behavior system  A system of behaviors that set of items.
has evolved to optimize interactions with an
unconditional stimulus (or reinforcer) in the Causal learning  Learning about the causes of
natural environment. Behavior systems theories an event.
propose that the behaviors that emerge in clas- Chained schedule  A set of two or more rein-
sical and instrumental conditioning situations forcement schedules, each signaled by its own
originate in such systems. discriminative stimulus, that must be com-
Behavior systems theory  A type of theory pleted in sequence before the primary reinforcer
that proposes that the behaviors that emerge occurs.
in classical and instrumental conditioning situ- Charles Darwin (1809–1882)  British biologist
ations originate in systems of behaviors that who proposed the theory of evolution in his
have evolved to optimize interactions with the 1859 book, On the Origin of Species.
unconditional stimulus (or reinforcer) in the Circadian rhythm  A daily activity cycle, based
natural environment. roughly on 24-hour intervals.
Behavioral economics  An approach that incor- Clark L. Hull (1884–1952)  An influential
GLOSSARY  467

American learning theorist who presented an ferent operant behaviors; each behavior pays off
ambitious theory of learning and motivation according to its own schedule of reinforcement.
that emphasized Drive and Habit. Conditional discrimination  A discrimination in
Classical conditioning  The procedure in which which two stimuli are presented, and the correct
an initially neutral stimulus (the conditional stimulus is determined based on which of the
stimulus, or CS) is repeatedly paired with an two stimuli is present or was presented recently.
unconditional stimulus (or US). The result is Conditional response (CR)  The response that
that the conditional stimulus begins to elicit a is elicited by the conditional stimulus after clas-
conditional response (CR). Nowadays, classical sical conditioning has taken place. The response
conditioning is important as both a behavioral is “conditional” in the sense that it depends on
phenomenon and as a method used to study the conditioning experience.
simple associative learning. Conditional stimulus (CS)  An initially neutral
Comparator theory  A theory of classical stimulus (like a bell, light, or tone) that begins
conditioning which proposes that the strength to elicit a conditional response after it has been
of the response to a conditional stimulus de- paired with an unconditional stimulus.
pends on a comparison of the strength of that Conditioned compensatory response  In
stimulus’s association with the unconditioned classical conditioning, a conditional response
stimulus and that of another stimulus. that opposes, rather than being the same as, the
Complements  Two or more commodities or unconditional response. It functions to reduce
reinforcers that “go together” in the sense that the strength of the unconditional response, as in
increasing the price of one will decrease the drug tolerance.
demand for both of them. For example, chips Conditioned emotional response (CER) A
and salsa; bagels and cream cheese. method for studying classical conditioning in
Compound  In classical conditioning, the which the conditional stimulus is associated
presentation of two or more conditional stimuli with a mild electric shock and the CS comes to
at about the same time. In a “simultaneous” suppress an ongoing behavior, such as lever-
compound, the conditional stimuli are present- pressing reinforced by food. Also called condi-
ed at the same time; in a “serial” compound, the tioned suppression.
stimuli are presented in a sequence. Also called Conditioned food-cup entry  A method for
compound CS. studying classical conditioning in which the
Compound conditional stimulus  A condi- conditional stimulus is associated with a food
tional stimulus that is composed of at least two pellet and the CS comes to elicit approach to
separate conditional stimuli, such as a light and the food-cup where the pellet will be delivered.
a noise. Also called magazine approach. See also goal
Compound potentiation  In classical condition- tracking.
ing, the finding that there is more conditioning Conditioned inhibition  Inhibition that is
to a weak conditional stimulus if it is combined learned through classical conditioning. The
with a more salient conditional stimulus during term also refers to a specific inhibitory condi-
conditioning. Mainly known in flavor aversion tioning procedure in which one conditional
learning, where conditioning of a weak odor stimulus is always paired with an unconditional
may be especially strong if it is combined with a stimulus, except when the CS is combined with
salient taste during conditioning. The opposite a second conditional stimulus. The second
of overshadowing. stimulus acquires inhibition. The procedure is
Compound schedules  A procedure in which also known as the feature-negative discrimination.
two or more schedules operate, such as a mul- Conditioned inhibitor (CS–)  A conditional
tiple schedule or a chained schedule. stimulus that evokes inhibition; e.g., one that
Concurrent measurement studies  Experi- suppresses or reduces the size of the con-
ments in which Pavlovian responses and in- ditioned response that would otherwise be
strumental (or operant) responses are measured elicited by a second conditional stimulus. See
at the same time in order to investigate their retardation-of-acquisition test and summation test.
relationship. Conditioned reflex  Another name for a
Concurrent schedule  A situation in which the conditional response, i.e., the response that is
organism can choose between two or more dif- elicited by a conditional stimulus after classical
468  GLOSSARY

conditioning has taken place. The term “reflex” Continuous reinforcement schedule  A sched-
is used here to connect the concept with the ule of reinforcement in which a reinforcer is
tradition of studying reflexes in physiology. delivered after each response.
Conditioned reinforcer or secondary reinforc- Counterconditioning  A conditioning proce-
er  A stimulus that has acquired the capacity to dure that reverses the organism’s response to a
reinforce behavior through its association with a stimulus. For example, by pairing the stimulus
primary reinforcer. with a positive event, an organism may be con-
Conditioned suppression  See conditioned emo- ditioned to respond positively to a stimulus that
tional response. would otherwise conditionally or uncondition-
Conditioning preparation  Any of several ally elicit fear.
methods for studying classical conditioning. Cumulative record  A graph in which the cu-
Configural cue  The unique new stimulus that mulative number of operant responses is plot-
is present when two or more conditional stimuli ted as a function of time. The slope of the line
are combined. gives the rate of responding. Usually created by
a cumulative recorder.
Configural theory  A theory that assumes that,
when organisms receive classical conditioning Cumulative recorder  A device used to analyze
with a compound conditional stimulus, they operant behavior in which a pen that rides on
associate the entire compound with the uncon- a slowly-moving piece of paper is deflected
ditional stimulus rather than forming separate upward with each response (press of a lever,
associations between each of its elements and for example). This creates a graph or cumulative
the unconditional stimulus. record which shows the cumulative number of
responses as a function of time.
Connectionism  An approach in cognitive
psychology and artificial intelligence in which D
knowledge is represented by a large number David Hume  (1711–1776) One of the British
of connections between nodes or units in a Empiricists.
network that bears a metaphorical resemblance Dead reckoning  A method of navigation in
to connections in the brain. Also called parallel which an animal travels to its goal by using an
distributed processing or neural networks. internal sense of direction and distance.
Connections  Associations. Declarative memory  Memory for things other
Consequence  Something that follows from an than actual behavioral procedures.
action. Operant behaviors are actions that are Delay conditioning  A classical conditioning
controlled by their consequences (such as the procedure in which the conditional stimulus
reinforcers or punishers they might produce). commences on its own and then terminates
Consolidation  The biological process by which with presentation of the unconditional stimulus.
a memory is stored in long-term memory. Delay discounting  The decrease in the subjec-
Context or contextual stimuli  External or tive value of a reinforcer that occurs when the
internal stimuli that are in the background reinforcer is delayed in time.
whenever learning or remembering occurs. Delayed matching-to-sample (DMTS)  A
Contiguity theory  Guthrie’s idea that learning procedure used to study working memory in
depends on a stimulus and response occur- which the organism is reinforced for responding
ring together in time rather than depending on to a test stimulus if it is the same as a “sample”
reinforcement. stimulus presented earlier.
Contingency  The “if-then” relationship be- Demand curve  A graph showing the demand
tween two events. See positive contingency and for a product at different prices. In behavioral
negative contingency. economics, the amount of a commodity (or
Contingency management  Behavioral reinforcer) that is taken when the experimenter
treatment of unwanted behavior in humans varies the amount of work that is required to
that works by manipulating the contingency earn it.
between the behavior (and its alternatives) Differential inhibition or discriminative inhibi-
and reinforcement. For example, smoking can tion  A procedure in classical conditioning in
be decreased if the smoker is reinforced with which a conditional stimulus is paired with the
vouchers or prizes for abstaining from smoking. unconditional stimulus on some trials and an-
GLOSSARY  469

other conditional stimulus is presented without many important experiments that emphasized
the unconditional stimulus on other trials. The cognitive and motivational factors in behavior
second CS may acquire inhibition. and learning.
Discriminative stimulus  In operant condition- Elemental theory  A theory that assumes that
ing, a stimulus that signals whether or not the when organisms receive conditioning with a
response will be reinforced. It is said to “set the compound conditional stimulus, they associate
occasion” for the operant response. each element of the compound separately with
Dishabituation  Recovery or return of a habitu- the unconditional stimulus.
ated response that is observed when the re- Elicited  Brought on by something that comes
sponse is tested with its original stimulus again before. Respondent behaviors are elicited by an
after exposure to a different stimulus. antecedent event.
Drive  A theoretical construct that corresponds Emitted  Literally, “to send forth.” Organisms
to motivation arising from biological needs, are said to emit operant behaviors in the sense
such as the need for food or water. that such behaviors are not elicited by an ante-
Drug tolerance  A reduction in the effective- cedent event; they appear spontaneous (but are
ness of a drug that can occur with repeated really controlled by their consequences).
exposure to the drug. Episodic memory  Memory for personal, often
Dual-process theory of habituation  A theory autobiographical, experiences and events
of habituation that states that the repeated pre- that typically involve what, where, and when
sentation of a stimulus engages two underlying information.
processes. One process reduces responding to Escape  An instrumental learning situation in
the stimulus (habituation). The other process is which performing an action or response ter-
arousing and increases responding to the stimu- minates a noxious or aversive stimulus that is
lus (sensitization). The actual response one already present. Involves negative reinforcement.
observes to a stimulus is the net effect of both of Ethology  The study of how animals behave in
these processes. their natural environments, typically with an
Dual-process view  The idea that humans learn emphasis on the evolution of the behavior.
both propositions and simple associations, and Exaptation  A trait that has adaptive value
that these are not necessarily the same. Thus, in but was not originally selected for its current
a classical conditioning experiment, the human function.
might learn that “the conditional stimulus leads Excitation  In classical conditioning, the po-
to the unconditional stimulus” as well as form a tential of a conditional stimulus to signal an
simple association between the CS and US. See unconditional stimulus or elicit a conditional
Proposition learning. response.
E Excitor (CS+)  A conditional stimulus that is
Early comparative psychologists  A group associated with an unconditional stimulus,
of primarily British biologists (e.g., C. Lloyd and has the potential to elicit a conditional
Morgan and George Romanes) who were active response.
in the late 1800s and who sought to study the Exemplar theory  An approach to categori-
evolution of the mind by inferring the mental zation which assumes that organisms store
activities of animals from their behavior. representations of a large number of individual
Edward L. Thorndike  (1874–1949) American members of a category and then respond to new
psychologist whose experiments with cats items depending on how similar they are to the
learning to get out of puzzle boxes profoundly items that were presented before.
influenced our thinking about the importance of Explicitly unpaired  In classical conditioning,
instrumental conditioning and the central place of a procedure in which a conditional stimulus is
animal learning experiments in psychology. presented alone and the unconditional stimulus
Edward C. Tolman  (1886–1959) American is presented at another time.
psychologist whose ideas about the value and Exposure therapy  A form of cognitive behav-
scientific validity of using intervening variables ior therapy in which a patient is exposed, with-
to explain behavior had a profound impact on out consequence, to stimuli that elicit undesir-
all of scientific psychology. Tolman also ran able cognitions, emotions, or behaviors in order
470  GLOSSARY

to weaken their strength. A form of either extinc- is presented with the unconditional stimulus
tion (if the undesirable responses were learned) on some trials and without the unconditional
or habituation (if the undesirable responses were stimulus on other trials. A second conditional
not learned). stimulus is added to signal when the uncondi-
External inhibition  Weakening of a conditional tional stimulus will occur.
response elicited by a conditional stimulus Fitness  An individual’s ability to survive and
when a neutral stimulus is added. Usually reproduce in a particular environment—and to
thought to occur through generalization decre- have offspring that will survive and reproduce.
ment; that is, the organism does not generalize Fixed action pattern  An innate sequence of
well between the conditional stimulus alone behaviors that is triggered by a specific stimu-
and its combination with the second stimulus. lus and continues to its end without regard to
Extinction  Reduction in the strength or prob- immediate consequences or feedback.
ability of a learned behavior that occurs when Fixed interval schedule  A schedule of rein-
the conditional stimulus is presented without forcement in which the first response after a
the unconditional stimulus (in classical con- fixed amount of time has elapsed (since the last
ditioning) or when the behavior is no longer reinforcer) is reinforced.
reinforced (in operant or instrumental condi- Fixed ratio schedule  A schedule of reinforce-
tioning). The term describes both the procedure ment in which a fixed number of responses is
and the result of the procedure. Behaviors that required for the delivery of each reinforcer.
have been reduced in strength through extinc-
tion are said to be “extinguished.” Focal sets  In probabilistic contrast theory, the
idea that the contingency between two events is
F calculated over a relevant subset of the trials.
Fading  A procedure in which a prompt or Frustration  Motivational response that occurs
discriminative stimulus for a desired behavior when a reward is smaller than expected.
is gradually withdrawn so that the organism is
able to emit the behavior without the prompt. G
Fear potentiated startle  An exaggerated startle General Pavlovian-instrumental transfer
reaction to a sudden stimulus that occurs when (general PIT)  A form of Pavlovian-instrumen-
the stimulus is presented while the organism is tal transfer in which the conditional stimulus
afraid, e.g., in the presence of a fear excitor. influences the rate of an ongoing instrumental
behavior that is associated with a different
Feature stimulus  In feature-positive and feature- reinforcer that is from the same general motiva-
negative discriminations, the second conditional tional system. See Pavlovian-instrumental transfer
stimulus that is added to the other (target stimu- and outcome-specific Pavlovian instrumental
lus) conditional stimulus to signal trials on which transfer.
the unconditional stimulus will or will not occur.
Generalization  The transfer of a learned re-
Feature theory  An approach to categorization sponse from one stimulus to a similar stimulus.
which assumes that organisms associate the
many features of category exemplars with rein- Generalization decrement  A decrease in the
forcers (or category labels) and then respond to transfer of a learned response from one stimu-
new items according to the combined associa- lus to another (i.e., generalization) if the two
tive strengths of their features. Learning rules stimuli are made to be different.
like the Rescorla-Wagner model would tend to Generalize  To respond to a new stimulus to the
isolate the most predictive features. extent that it is similar to another stimulus that
Feature-negative discrimination  A condition- has been reinforced or trained.
ing procedure in which a conditional stimulus Geometric module  A representation of the
is presented with the unconditional stimulus global shape of the environment that is thought
on some trials and without the unconditional to be separate from the representations of indi-
stimulus on other trials. A second conditional vidual landmarks.
stimulus is added to signal when the uncondi- Geons  Short for geometric ions; primitive
tional stimulus will not occur. See also condi- components of visual perception according to
tioned inhibition. recognition by components theory.
Feature-positive discrimination  A condition- Goal tracking  Movement toward the site
ing procedure in which a conditional stimulus where a positive unconditional stimulus will be
H

Habituation  A decrease in the strength of a naturally elicited behavior that occurs through repeated presentations of the eliciting stimulus.
Hall-Pearce negative transfer  Interference with conditioning that is produced by pairing a conditional stimulus with a weak unconditional stimulus before pairing it with a stronger unconditional stimulus.
Hedonic shift  The observation that in taste aversion learning, the flavor conditional stimulus actually becomes unpleasant.
Hedonism  The pursuit of pleasure and the avoidance of pain.
Hidden units  Nodes or units in a connectionist network that come between the input and output units and usually have no other connections outside the network (and are thus not “visible” to outside systems).
Homeostasis  The tendency of an organism to maintain an internal equilibrium.
Homologous  Two or more traits that are similar in structure and evolutionary origin.
Hybrid attentional models  Models of classical conditioning that acknowledge that organisms pay attention to conditional stimuli that are either good predictors of an outcome (unconditional stimulus) or whose outcome is uncertain.

I

Immanuel Kant  (1724–1804) German philosopher who thought that the mind comes into the world with certain inborn assumptions or predilections with which it molds experience.
Immunization effect  The finding that exposure to escapable shocks before exposure to inescapable shocks can protect an animal from the learned helplessness effect.
Imprinting  Learning in very young organisms that establishes attachment to a parent (or an object identified as a parent; sometimes called “filial imprinting”). In “sexual imprinting,” a similar process may influence later sexual behavior.
Inactive  Resting state of a memory representation or node. In SOP theory, it is the final state to which elements in a node decay after they have been in A1 and then A2.
Incentive-based treatment  See contingency management.
Incentive learning  A process in which organisms learn about the value of a specific reinforcer while they are in a particular motivational state.
Incentive motivation  Motivation for instrumental behavior created by anticipation of a positive reinforcer. See also rG-sG mechanism.
Independents  Two or more commodities or reinforcers that do not “go together” in the sense that increasing the price of one causes its consumption to decrease without changing consumption of the other. Umbrellas and compact disks, for example.
Information processing  A model of cognition, based on a computer metaphor, in which the organism receives sensory input from the environment and then proceeds to operate on that information through a sequence of activities in sensory memory, short-term memory (working memory), and long-term memory (reference memory).
Inhibition  In classical conditioning, the capacity of a conditional stimulus to signal a decrease in the probability of the unconditional stimulus. More generally, an active process that suppresses excitation or reduces the strength of a response.
Inhibition of delay  In classical conditioning, inhibition that develops to the early portion of a conditional stimulus in a delay conditioning procedure. The early part of a conditional stimulus signals a period without the unconditional stimulus.
Inhibitor (CS–)  A conditional stimulus that signals a decrease in the probability or intensity of the unconditional stimulus and therefore evokes inhibition.
Instrumental conditioning or instrumental learning  Any situation based on Thorndike’s method in which animals can learn about the relationship between their actions and consequences. Essentially the same as operant conditioning, except that in instrumental learning experiments the experimenter must set up each and every opportunity the organism has to respond.
Interference  Memory impairment caused by conflicting information that was learned at some other time.
Interim behaviors  Stereotyped behaviors that occur early in the interval between regularly delivered reinforcers.
Internal clock  A hypothetical cognitive device that codes or represents the passage of time.
Intertrial interval  The period of time between two successive trials.
Interval schedule  A schedule of reinforcement in which a response is reinforced only if it occurs after a set amount of time has elapsed since the last reinforcer.
Intervening variable  A theoretical concept that cannot be observed directly, but is used in science to understand the relationship between independent and dependent variables. To be scientific, intervening variables must be carefully defined in terms of the events that lead to them and the behavioral outputs they lead to. Also known as theoretical constructs.
Ivan Pavlov  (1849–1936) Russian physiologist who published the first systematic observations of classical conditioning (also known as Pavlovian learning) and introduced many of the terms that are still used to describe such conditioning today.

J

John Locke  (1632–1704) One of the British Empiricists.
Julien de la Mettrie  (1709–1751) French writer who believed that the body affects the mind.

L

Landmark  A cue that has a fixed relationship with a goal, but is not close to it, which organisms learn about and use to get around in space.
Latent inhibition or CS-preexposure effect  Interference with conditioning that is produced by repeated exposures to the conditional stimulus before conditioning begins.
Latent learning experiment  An experiment by Tolman and Honzik (1930) in which animals were not rewarded during initial trials, and then were rewarded for correct responding in a second phase. After the first rewarded trial, the rats began responding efficiently, as if they had previously been learning without reward. Although the reward was not necessary for learning, it did appear necessary to motivate performance.
Law of effect  Originally, Thorndike’s idea that responses that are followed by pleasure will be strengthened and those that are followed by discomfort will be weakened. Nowadays, the term refers to the idea that operant or instrumental behaviors are lawfully controlled by their consequences.
Learned helplessness effect  Interference with learning a new instrumental action, typically an escape response, that is produced by exposure to uncontrollable (inescapable) electric shock.
Learned helplessness hypothesis  The theoretical idea that organisms exposed to inescapable and unavoidable shocks learn that their actions do not control environmental outcomes.
Learned irrelevance  In classical conditioning, the finding that when there is no contingency between a CS and a US in an initial phase, animals have difficulty learning an association between the two events when the events are later paired.
Learning theory  The modern field in which principles of learning, cognition, and behavior are investigated by studying animals learning under controlled laboratory conditions.
Learning/performance distinction  The idea that learning is not the same as performance, and that behavior may not always be an accurate indicator of knowledge.
Long-delay learning  Conditioning that occurs when there is a long period of time between the conditional stimulus and the unconditional stimulus.
Long-term memory  A theoretical part of memory that has a very large capacity and can retain information over long periods or retention intervals. Also used to characterize situations in which an experience has a long-lasting effect on behavior.

M

Magazine approach  See conditioned food-cup entry.
Massed trials  Conditioning trials separated by a short intertrial interval.
Matching law  A principle of choice behavior which states that the proportion of responses directed toward one alternative will equal (match) the percentage of reinforcers that are earned by performing that alternative.
Matching-to-sample  A procedure in which the organism is reinforced for responding to a test stimulus if it is the same as a “sample” stimulus.
McCollough effect  In color perception, the evocation of an opposite-color after-image by black-and-white stimuli that have been associated with a color.
Mediated generalization  Treating two stimuli as alike not because they are physically similar but because they are associated with a common stimulus.
Melioration  An explanation of matching which claims that the organism will always respond so as to improve the local rate of reinforcement. This ultimately leads to a steady state of behavior that matches the rates of reinforcement on the two alternatives.
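For the two-alternative case, the matching law defined above is often written as a simple proportion. The symbols below are added here as an illustration (B for response rates, r for obtained reinforcement rates) and are not the glossary's own notation:

\[ \frac{B_1}{B_1 + B_2} = \frac{r_1}{r_1 + r_2} \]

Read this way, an alternative that earns, say, 75% of the reinforcers should come to receive about 75% of the responses.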
Memory reactivation  Restoration of forgotten information after reexposure to part of the learning situation.
Minimum distance model  A model of operant behavior which states that when given any reinforcement schedule, the organism will respond in a way that gets it as close as possible to the bliss point.
Modulation  When a stimulus influences behavior by increasing or decreasing the response evoked by another stimulus, rather than by eliciting a response itself.
Modules  Hypothetical specialized cognitive mechanisms that have evolved to deal with information in a restricted domain.
Morgan’s Canon  A law proposed by C. Lloyd Morgan which states that a behavior should always be explained by the simplest mental process possible (also known as the law of parsimony).
Multiple oscillator model  A model of interval timing that represents time in terms of the status of a set of hypothetical units that cycle between different values, each with a different fixed period over time.
Multiple schedule  A procedure in which two or more reinforcement schedules, each signaled by its own discriminative stimulus, are presented one at a time and alternated.
Multiple-time-scale model  A model of interval timing which assumes that the start of a trial is recorded in short-term memory and then gradually fades over time. Animals time events by associating them with the strength of this memory at a given point in time.

N

Natural selection  A process that allows individuals with certain features to leave more offspring in the next generation; typically, individuals without those features are less successful.
Negative automaintenance  The finding that pecking at a keylight conditional stimulus in pigeons may persist even when the peck prevents the reinforcer from occurring.
Negative contingency  A situation where the probability of one event is lower if another event has occurred. In classical conditioning, if the unconditional stimulus is less probable when the conditional stimulus has occurred, the conditional stimulus becomes a conditioned inhibitor. In instrumental conditioning, a biologically significant event may likewise be less probable if a behavior occurs. If the significant event is negative or aversive, then escape or avoidance learning occurs; if the significant event is positive, it is called omission. Also called negative correlation.
Negative contrast effect  When “expectation” of a large positive reward decreases the positive reaction to a smaller positive reward.
Negative correlation  See negative contingency.
Negative occasion setter  In classical conditioning, a type of modulator that decreases the response evoked by another conditional stimulus in a way that does not depend on the modulator’s direct inhibitory relation with the unconditional stimulus.
Negative patterning  In classical conditioning, a procedure in which two conditional stimuli are paired with an unconditional stimulus when they are presented alone, but occur without the unconditional stimulus when they are combined. It is difficult for an elemental theory to explain why an organism can respond accordingly.
Negative reinforcement  A situation in which an operant behavior is strengthened (“reinforced”) because it removes or prevents a negative (aversive) stimulus.
Negative sign tracking  Movement away from a stimulus that signals either an aversive event or the reduced probability of a positive event.
Negative transfer  When learning one task interferes with learning or performance of a second task.
Network  A set of interconnected memory nodes.
Neural networks  See connectionism.
Nodes  Memory representations of items in the world.

O

Occasion setter  In classical conditioning, a stimulus that may not itself elicit a response, but modulates behavior to another stimulus.
Omission  An instrumental or operant conditioning procedure in which the behavior prevents the delivery of a positive (reinforcing) stimulus. The behavior typically decreases in strength.
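The negative contingency entry above (and the positive contingency entry later in this glossary) can be summarized with a standard contingency statistic. The notation below is added here for illustration and is not quoted from the glossary:

\[ \Delta P = P(\text{US} \mid \text{CS}) - P(\text{US} \mid \text{no CS}) \]

A negative contingency corresponds to \(\Delta P < 0\) (the conditional stimulus tends to become a conditioned inhibitor), a positive contingency to \(\Delta P > 0\) (a conditioned excitor), and \(\Delta P = 0\) describes uncorrelated events.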
Operant  A behavior that is controlled by its consequences. The canonical example is the rat’s lever-pressing, which is controlled by the food-pellet reinforcer.
Operant conditioning  Any situation based on Skinner’s setup in which an organism can learn about its actions and consequences. The same as instrumental conditioning except that in an operant conditioning experiment the organism is “free” to make the operant response (e.g., lever-pressing) as often as it “wants” to.
Operant experiment  An experimental arrangement in which a reinforcer (such as a food pellet) is made contingent upon a certain behavior (such as lever-pressing).
Operant-respondent distinction  Skinner’s distinction between operant behavior, which is said to be emitted and controlled by its consequences, and respondent behavior, which is said to be elicited and controlled by its antecedents.
Operational behaviorism  An approach, started by Edward Tolman, which departs from radical behaviorism by using unobservable intervening variables (theoretical constructs) in the explanation of behavior. The approach is scientific as long as the theoretical constructs are carefully defined and falsifiable. It is the approach generally accepted by most modern scientific psychologists.
Opponent process  A more general term for the type of compensatory process exemplified by the b-process in opponent-process theory.
Opponent-process theory  A theory that emphasizes the fact that emotional stimuli often evoke an initial emotional reaction followed by an after-reaction of the opposite valence. With repeated exposure to the emotional stimulus, the after-reaction grows and the initial reaction weakens, which may fundamentally change the motivation behind instrumental behavior controlled by positive and negative stimuli.
Ordinal prediction  A hypothesis that specifies a greater-than or less-than relationship between two conditions or two groups.
Outcome-specific Pavlovian-instrumental transfer (outcome-specific PIT)  A form of Pavlovian-instrumental transfer in which the conditional stimulus specifically influences the rate of an ongoing instrumental behavior that is associated with the same reinforcing outcome. See Pavlovian-instrumental transfer and general Pavlovian-instrumental transfer.
Overexpectation effect  In classical conditioning, the finding that two conditional stimuli that have been separately paired with an unconditional stimulus may actually lose some of their potential to elicit conditional responding if they are combined and the compound is paired with the same unconditional stimulus.
Overshadowing  In classical conditioning, the finding that there is less conditioning to a weak conditional stimulus if it is combined with a more salient conditional stimulus during conditioning trials.

P

Panic disorder  A psychological disorder characterized by recurrent panic attacks and the fear of having additional ones.
Paradoxical reward effects  Any of several behavioral effects in which exposure to nonreinforcement appears to increase the strength of instrumental behavior (as in the partial reinforcement extinction effect), or exposure to larger reinforcers appears to decrease the strength of instrumental behavior (as in the “magnitude of reinforcement extinction effect”). Often involves frustration.
Parallel distributed processing  See connectionism.
Partial reinforcement extinction effect (PREE)  The finding that behaviors that are intermittently reinforced are more persistent (take longer to extinguish) than behaviors that are reinforced every time they occur.
Pavlovian-instrumental transfer  An effect in which a Pavlovian conditional stimulus is shown to influence the rate of an ongoing instrumental behavior if the conditional stimulus is presented while the organism is engaged in that behavior.
Peak procedure  A method for studying timing processes in which the first response after a fixed interval after the start of a signal is reinforced. Response rate as a function of time in the signal is used to assess the accuracy of timing.
Peak shift  In discrimination learning, a change in the generalization gradient surrounding S+ such that the highest level of responding moves away from S+ in a direction away from the S–.
Perceptual learning  An increase in the discriminability of two stimuli that results from simple exposure to the two stimuli.
Perruchet effect  In classical conditioning, the finding that humans given a series of trials in which a conditional stimulus is sometimes paired with an unconditional stimulus and sometimes not show more conditioned responding after a series of CS-US trials and less conditioned responding after a series of CS-only trials, but verbally predict the opposite (i.e., say that the US is less likely to occur after a string of CS-US trials and more likely to occur after a string of CS-only trials). Suggests that humans learn both associations and propositions during conditioning, and that the two types of learning can be independent. See Proposition learning and Dual-process view.
Place cells  Cells in the rat hippocampus that become active when the animal is in a particular location.
Positive contingency  A situation where the probability of one event is higher if another event has occurred. In classical conditioning, if the unconditional stimulus is more probable when the conditional stimulus has occurred, the conditional stimulus becomes a conditioned excitor. In instrumental conditioning, a biologically significant event may likewise be more probable if a behavior occurs. If the significant event is negative or aversive, then punishment occurs; if the significant event is positive, then reward learning occurs.
Positive contrast effect  “Expectation” of a small positive reward can increase the positive reaction to a larger positive reward.
Positive occasion setter  In classical conditioning, a type of modulator that increases the response evoked by another conditional stimulus in a way that does not depend on the modulator’s direct association with the unconditional stimulus.
Positive patterning  In classical conditioning, a procedure in which two conditional stimuli are presented with the unconditional stimulus when they are presented together, but without the unconditional stimulus when they are presented alone.
Positive reinforcement  An instrumental or operant conditioning procedure in which the behavior is followed by a positive stimulus or reinforcer. The behavior typically increases in strength.
Pre-commitment strategy  A method for decreasing impulsiveness and increasing self-control in which the individual makes choices well in advance.
Predatory imminence  An organism’s perceived spatial or temporal proximity to a predator, which can determine the form of its anti-predator response (or species-specific defense reaction).
Prediction error  Any difference between what is predicted to occur on a conditioning trial and what actually occurs. For example, prediction error is present if the unconditional stimulus is bigger or smaller on a conditioning trial than what the conditional stimuli that are present predict. The conditioning process adjusts associative strengths over trials so as to correct or reduce the prediction error. See Surprisingness of the US and Surprisingness of the CS.
Premack principle  The idea that reinforcement is possible when a less-preferred behavior will allow access to a more-preferred behavior.
Preparedness  The extent to which an organism’s evolutionary history makes it easy for the organism to learn a particular association or response. If evolution has made something easy to learn, it is said to be “prepared.”
Primary reinforcer  An event that unconditionally reinforces operant behavior without any particular training.
Primed  When a node or representation has been activated in short-term memory.
Proactive interference  Memory impairment caused by information learned or presented before the item that is to be remembered.
Probabilistic contrast model  A model developed to explain associative learning in humans that computes contingencies between events by defining and comparing the probability of an event in the presence and absence of selected cues.
Procedural memory  Memory for how to automatically execute or perform a particular behavioral or cognitive task.
Proposition learning  In classical conditioning, the learning of a verbal relation, such as “the conditional stimulus causes the unconditional stimulus to occur” or “the conditional stimulus leads to the unconditional stimulus,” in contrast to merely associating the CS and the US.
Prospective code  Memory held in working memory about what to do (or what will come) next. Compare to retrospective code.
Protection from extinction  In classical conditioning, the finding that extinction trials with a conditioned excitor may be ineffective at reducing conditional responding if the excitor is combined with a conditioned inhibitor during extinction.
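The prediction error entry above is usually formalized along Rescorla-Wagner lines. The equation below is the standard textbook form of that learning rule, included here as an illustration rather than quoted from this page:

\[ \Delta V = \alpha\beta\,(\lambda - \Sigma V) \]

Here \(\lambda\) is the magnitude of the unconditional stimulus actually delivered on the trial, \(\Sigma V\) is the magnitude predicted by all conditional stimuli present, \((\lambda - \Sigma V)\) is the prediction error, and \(\alpha\) and \(\beta\) are learning-rate parameters; each stimulus's associative strength changes in proportion to the error.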
Prototype  Representation of what is typical or average for a particular category.
Prototype theory  An approach to categorization which assumes that organisms learn what is typical or average for a category and then respond to new exemplars according to how similar they are to the average.
Pseudoconditioning  A process whereby a conditional stimulus can evoke responding because the organism has merely been exposed to the unconditional stimulus, rather than true associative learning.
Punisher  An aversive stimulus that decreases the strength or probability of an operant behavior when it is made a consequence of the response.
Punishment  An instrumental or operant conditioning procedure in which the behavior is followed by a negative or aversive stimulus. The behavior typically decreases in strength.

Q

Quantitative law of effect  A more general, but still quantitative, statement of the matching law in which an operant response is viewed as being chosen over all other potential responses.

R

Radial maze  An elevated maze that has a central area from which arms extend in all directions.
Radical behaviorism  The type of behaviorism identified with B. F. Skinner which emphasizes the exclusive study of external events, such as observable stimuli and responses, and avoids any inferences about processes inside the organism.
Rapid reacquisition  In classical conditioning, the quick return of an extinguished conditional response when the conditional stimulus and unconditional stimulus are paired again. In instrumental conditioning, the quick return of extinguished behavior once the response and reinforcer are paired again.
Ratio schedule  A schedule of reinforcement in which the delivery of each reinforcer depends on the number of responses the organism has performed since the last reinforcer.
Rationalism  Term used to refer to Kant’s school of thought, in which the mind was thought to act on experience with a set of inborn predilections and assumptions.
Reconsolidation  A process in which a consolidated memory that has recently been reactivated is consolidated again.
Recuperative behaviors  Behaviors, such as licking a wound, which are elicited by tissue damage and function to promote healing.
Reference memory  Another name for long-term memory.
Reflex action  A mechanism through which a specific environmental event or stimulus elicits a specific response. Originated from René Descartes.
Reinforcement  An instrumental or operant conditioning procedure in which the behavior’s consequence strengthens or increases the probability of the response. See positive reinforcement and negative reinforcement.
Reinforcement theory  A phrase used to describe learning theories, like Thorndike’s, which assume that reinforcement is necessary for learning.
Reinforcer  Any consequence of a behavior that strengthens the behavior or increases the probability that the organism will perform it again.
Reinforcer devaluation effect  The finding that an organism will stop performing an instrumental action that previously led to a reinforcer if the reinforcer is separately made undesirable through association with illness or satiation.
Reinforcer substitutability  See substitutability.
Reinstatement  Recovery of the learned response in either classical or instrumental conditioning when the unconditional stimulus or reinforcer is presented alone after extinction.
Relapse  The return of undesirable cognitions, emotions, or behaviors after apparent improvement.
Relative validity  In classical conditioning, an experimental design and result that supports the view that conditioning is poor when the conditional stimulus is combined with a better predictor of the unconditional stimulus.
Releaser or releasing stimulus  A specific stimulus that elicits a fixed action pattern. Also called sign stimulus.
René Descartes  (1596–1650) French philosopher and mathematician who distinguished between mind and body, and also discussed reflex action as a mechanical principle that controls the activity of the body.
Renewal effect  Recovery of responding that occurs when the context is changed after extinction. Especially strong when the context is changed back to the original context of conditioning.
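The quantitative law of effect defined above is commonly written as Herrnstein's hyperbola; the version below is the standard form found in the operant literature, included here as an illustration (the symbols are not drawn from this page):

\[ B = \frac{kR}{R + R_e} \]

where \(B\) is the rate of the operant response, \(R\) is the rate of reinforcement it earns, \(R_e\) is the rate of reinforcement obtained from all other ("extraneous") sources, and \(k\) is the maximum possible response rate.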
Respondent  A behavior that is elicited by an antecedent stimulus.
Response deprivation hypothesis  The idea that restricting access to a behavior below its baseline or preferred level will make access to that behavior a positive reinforcer.
Response form  The qualitative nature of the conditional response. Determined by both the unconditional stimulus and by the nature of the conditional stimulus.
Response learning  See R-O learning.
Retardation-of-acquisition test  A test procedure that identifies a stimulus as a conditioned inhibitor if it is slower than a comparison stimulus to acquire excitation when it is paired with an unconditional stimulus.
Retrieval failure  Inability to recover information that is stored in long-term memory. A common cause of forgetting.
Retrieval-generated priming  Activation of an item, node, or representation in short-term memory that occurs when a cue that is associated with that item is presented.
Retroactive interference  Memory impairment caused by information learned or presented after the item that is to be remembered.
Retrospective code  A memory held in working memory about what stimuli have occurred previously. Compare to prospective code.
Reward learning  An instrumental or operant conditioning procedure in which the behavior is followed by a positive event. The behavior typically increases in strength.
rG-sG mechanism  A theoretical process that allowed Hull, Spence, and others to explain in S-R terms how “expectations” of reward motivate instrumental responding.
R-O learning  Another term used to describe instrumental and operant conditioning that emphasizes the theoretical content of that learning (an association between a behavior, R, and a biologically significant outcome, O).

S

Scalar property  A property of interval timing in which the probability of responding is a similar function of the proportion of time in the interval being timed, regardless of the actual duration of that interval.
Schedule of reinforcement  A relationship between an operant behavior and its consequences or payoff. See ratio, interval, and concurrent schedules.
Schedule-induced polydipsia  Excessive drinking that is observed if animals are given food reinforcers at regular intervals.
Search image  An attentional or memory mechanism that helps predators search for specific cryptic prey.
Secondary reinforcer  See conditioned reinforcer.
Second-order or higher-order conditioning  A classical conditioning procedure in which a conditional response is acquired by a neutral stimulus when the latter is paired with a stimulus that has previously been conditioned.
Self-generated priming  Activation of an item, node, or representation in short-term memory that occurs when the item is presented.
Semantic memory  A subset of declarative memory that corresponds to memory for various invariant facts about the world.
Sensitization  An increase in the strength of elicited behavior that results merely from repeated presentations of the eliciting stimulus.
Sensory preconditioning  A classical conditioning procedure in which two neutral stimuli are first paired with each other, and then one of them is paired with an unconditional stimulus. When the other neutral stimulus is tested, it evokes a conditional response, even though it was never paired with the unconditional stimulus itself.
Sequential theory  A theory of the partial reinforcement extinction effect that suggests that extinction is slow after partial reinforcement because the behavior has been reinforced while the organism remembers recent nonrewarded trials.
Shaping or shaping by successive approximations  A procedure for training a new operant behavior by reinforcing behaviors that are closer and closer to the final behavior that is desired.
Short-term memory  A theoretical part of memory that has a small capacity and can retain information only briefly. Also used to characterize situations in which an experience has only a short-lasting effect on behavior.
Sign stimulus  See releaser.
Sign tracking  Movement toward a stimulus that signals a positive event or the reduced probability of a negative event.
Simultaneous conditioning  In classical conditioning, a procedure in which the conditional stimulus and unconditional stimulus are presented at the same time.
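The scalar property defined above is often summarized by saying that the variability of timed behavior grows in proportion to the duration being timed; in the symbols commonly used in the timing literature (not notation taken from this page):

\[ \frac{\sigma_t}{\mu_t} \approx \text{constant} \]

so that response distributions obtained with different intervals superimpose when plotted against the proportion of the interval that has elapsed (see also superposition, below).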
Skinner box  An experimental chamber that provides the subject something it can repeatedly manipulate, such as a lever (for a rat) or a pecking key (for a pigeon). The chamber is also equipped with mechanisms that can deliver a reinforcer (such as food) and other stimuli (such as lights, noises, or tones).
S-O learning  Another term to describe classical or Pavlovian conditioning that emphasizes the theoretical content of that learning (an association between a stimulus, S, and a biologically significant outcome, O).
Sometimes opponent process (SOP)  In SOP theory, the idea that a memory node that is in A2 can sometimes evoke a response that is opposite to the response that is evoked when the node is in A1.
SOP theory  A theory of classical conditioning that emphasizes activation levels of elements in memory nodes corresponding to conditional stimuli and unconditional stimuli, especially as the activation levels change over time.
Spaced trials  Conditioning trials separated by a long intertrial interval.
Species-specific defense reactions (SSDRs)  Innate reactions that occur when an animal encounters a predator or a conditional stimulus that arouses fear. They have probably evolved to reduce predation. Examples are freezing and fleeing.
Specific hungers  The tendency for animals to seek and prefer certain foods that might contain specific nutrients they are currently deprived of.
Spontaneous recovery  The reappearance, after the passage of time, of a response that had previously undergone extinction. Can occur after extinction in either classical or instrumental conditioning.
S-R learning  The learning of an association between a stimulus and a response.
S-S learning  The learning of an association between two stimuli.
Standard operating procedures (SOP)  An established procedure to be followed in carrying out a given operation or in a given situation. In SOP theory of classical conditioning, the standard dynamics of memory.
Standard pattern of affective dynamics  According to opponent process theory, the characteristic sequence of responses elicited by a novel emotional stimulus.
Stimulus compound  See compound.
Stimulus control  When operant behaviors are controlled by the stimuli that precede them.
Stimulus elements  Theoretical stimuli or features that make up more complex stimuli.
Stimulus generalization  See generalization.
Stimulus generalization gradient  A characteristic change in responding that is observed when organisms are tested with stimuli that differ in increasing and/or decreasing steps from the stimulus that was used during training.
Stimulus learning  See S-O learning.
Stimulus relevance  The observation that learning occurs more rapidly with certain combinations of conditional and unconditional stimuli (such as a taste and illness) than with other stimulus combinations (such as taste and shock).
Stimulus sampling theory  A mathematical theory proposed by Estes which extended Guthrie’s idea of stimulus elements.
Stimulus substitution  In classical conditioning, the idea that the conditional stimulus is associated with the unconditional stimulus and becomes a substitute for it (eliciting the same response).
Structuralism  A school of psychology, especially active in the late 1800s and early 1900s, which relied on introspection as a method for investigating the human mind.
Substitutability  A way of conceptualizing the relationships between different reinforcers or commodities as substitutes, complements, and independents.
Substitutes  Two or more commodities or reinforcers that can replace or be exchanged for one another, as demonstrated when increasing the price of one of them will decrease the consumption of it and increase demand for the other. For example, Coke and Pepsi.
Successive negative contrast  A negative contrast effect in which exposure to a large positive reward decreases the subsequent positive reaction to a smaller positive reward than would ordinarily be observed.
Summation test  A test procedure in which conditional stimuli that are conditioned separately are then combined in a compound. The procedure can identify a stimulus as a conditioned inhibitor if it suppresses responding evoked by the other stimulus (and does so more than a comparison stimulus that might reduce responding through generalization decrement).
Superposition  The common finding in research on interval timing that responding as a function of the proportion of the interval being timed is the same regardless of the duration of the actual interval being timed—the curves appear identical when they are plotted on the same graph. Demonstrates the scalar property.
Superstitious behavior  A behavior that increases in strength or frequency because of accidental pairings with a reinforcer.
Suppression ratio  The measure of conditioning used in the conditioned emotional response or conditioned suppression method. It is the value obtained by dividing the number of responses made during the conditional stimulus by the sum of the responses made during the conditional stimulus and during an equal period of time before the stimulus. If the value is .50, no conditioned suppression has occurred. If the value is 0, a maximum amount of conditioned suppression has occurred.
Surprisingness of the CS  The difference between the actual properties of a conditional stimulus and those already predicted or represented (primed) in short-term memory.
Surprisingness of the US  The difference between the actual magnitude of the unconditional stimulus and that which is predicted by conditional stimuli present on a conditioning trial. In the Rescorla-Wagner model, learning only occurs if there is a discrepancy between the unconditional stimulus that is predicted and the one that actually occurs.
SΔ (S–)  A discriminative stimulus that suppresses operant responding because it signals a decrease in the availability of reinforcement or sets the occasion for not responding.

T

Tabula rasa  The view, endorsed by the British Empiricists, that the mind is a “blank slate” before it is written upon by experience.
Target stimulus  In feature-positive and feature-negative discriminations, the conditional stimulus that is present on every trial.
Taste aversion learning  The phenomenon in which a taste is paired with sickness, and this causes the organism to reject that taste in the future.
Taste-reactivity test  A method in which experimenters examine the rat’s behavioral reactions to tastes delivered directly to the tongue.
Temporal bisection  A procedure used to study interval timing in which one response is reinforced after a signal of one duration, and another response is reinforced after a signal of another duration. When responding to stimuli with intermediate durations is tested, the middle point (the duration at which the animal makes either response with equal probability) occurs at the geometric mean of the two reinforced durations (e.g., 4 seconds if 2 and 8 second cues have been reinforced).
Temporal context  Contextual stimuli that change with the passage of time.
Temporal generalization  A procedure for studying interval timing in which an animal is first reinforced if it responds after stimuli of a specific duration and then stimuli of increasing and/or decreasing durations are tested.
Terminal behavior  Stereotyped behaviors that occur toward the end of the interval between regularly delivered reinforcers.
Theoretical construct  See intervening variable.
Thomas Hobbes  (1588–1679) A philosopher who suggested that human thoughts and actions follow the principle of hedonism.
Trace conditioning  A classical conditioning procedure in which the unconditional stimulus is presented after the conditional stimulus has been terminated.
Trace decay  The theoretical idea that forgetting is due to the actual loss or destruction of information that is stored in memory.
Transfer tests  A procedure in which an organism is tested with new stimuli or with old stimuli in a new situation. In categorization experiments, this is the method of testing the animal’s ability to categorize stimuli it has not categorized before.
Transfer-of-control experiments  Experiments that test for Pavlovian-instrumental transfer, and thus demonstrate the effects of presenting a Pavlovian conditional stimulus on the rate of an ongoing instrumental behavior.
Transposition  Differential responding to two stimuli, apparently according to their relation rather than their absolute properties or individual features. For example, after discrimination training with two stimuli that differ along a dimension (e.g., size), the organism might choose a more extreme stimulus along the dimension rather than the stimulus that was previously reinforced.
Two-factor theory (two-process theory)  A theory of avoidance learning that states that (1) Pavlovian fear learning allows warning stimuli to evoke conditioned fear that motivates avoidance behavior and provides the opportunity for (2) reinforcement of the instrumental avoidance response through fear reduction. More generally, the theoretical idea that Pavlovian learning is always a second process at work in instrumental learning situations.
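Two of the entries above describe simple computations, which can be written out explicitly; the symbols and the worked numbers below are added here as illustrations and are not quoted from the glossary. For the suppression ratio:

\[ \text{suppression ratio} = \frac{\text{responses during CS}}{\text{responses during CS} + \text{responses during the equal pre-CS period}} \]

so, for example, 10 responses during the conditional stimulus and 40 during the pre-CS period give \(10/(10+40) = .20\), indicating substantial suppression. For temporal bisection, the indifference point lies at the geometric mean of the two trained durations, \(\sqrt{T_{\text{short}} \times T_{\text{long}}}\); with 2-s and 8-s cues, \(\sqrt{2 \times 8} = 4\) seconds, as in the entry above.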
U

Unconditional response (UR)  In classical conditioning, an innate response that is elicited by a stimulus in the absence of conditioning.
Unconditional stimulus (US)  In classical conditioning, the stimulus that elicits the response before conditioning occurs.
US preexposure effect  Interference with conditioning that is produced by repeated exposures to the unconditional stimulus before conditioning begins.

V

Variable interval schedule  A schedule of reinforcement in which the behavior is reinforced the first time it occurs after a variable amount of time since the last reinforcer.
Variable ratio schedule  A schedule of reinforcement in which a variable number of responses are required for delivery of each reinforcer.

W

Warning signals  Environmental stimuli in avoidance learning situations that are associated with the aversive stimulus through Pavlovian conditioning.
Water maze  An apparatus used to investigate spatial learning in which the rat or mouse subject swims in a circular pool of milky water to find a submerged platform on which to stand.
Within-compound association  A learned association that may be formed between two conditional stimuli when they are presented together in a compound.
Working memory  A system for temporarily holding and manipulating information; another name for short-term memory.
A Allen, T. A., & Fortin, N. J. (2013). The evolution of
Abramson, L. Y., Seligman, M. E. P., & Teasdale, J. episodic memory. Proceedings of the National Academy
(1978). Learned helplessness in humans: Critique of Sciences, 110, 10379-10386.
and reformulation. Journal of Abnormal Psychology, Allison, J. (1979). Demand economics and experimen-
87, 49–74. tal psychology. Behavioral Science, 24, 403–415.
Adams, C. D. (1982). Variations in the sensitivity of Allison, J. (1983). Behavioral substitutes and comple-
instrumental responding to reinforcer devaluation. ments. In R. L. Malgren (Ed.), Animal cognition and
Quarterly Journal of Experimental Psychology, 34B, behavior. Amsterdam: North-Holland.
77–98. Allison, J. (1989). The nature of reinforcement. In S. B.
Adams, C. D., & Dickinson, A. (1981). Instrumental Klein & R. R. Mowrer (Eds.), Contemporary learn-
responding following reinforcer devaluation. Quar- ing theories: Instrumental conditioning theory and the
terly Journal of Experimental Psychology, 33B, 109–121. impact of biological constraints on learning (pp. 13–39).
Adams, C. D., & Dickinson, A. (1982). Variations in Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
the sensitivity of instrumental responding to rein- Allison, J., & Timberlake, W. (1974). Instrumental and
forcer devaluation. Quarterly Journal of Experimental contingent saccharin licking in rats: Response depri-
Psychology, 34B, 77–98. vation and reinforcement. Learning and Motivation,
Aguado, L., Symonds, M., & Hall, G. (1994). Inter- 5, 231–247.
val between preexposure and test determines the Allman, M. J., Teki, S., Griffiths, T. D., & Meck, M. H.
magnitude of latent inhibition: Implications for an (2014). Properties of the internal clock: First- and
interference account. Animal Learning & Behavior, 22, second-order principles of subjective time. Annual
188–194. Review of Psychology, 65, 743–771.
Ainslie, G. (1975). Specious reward: A behavioral Allswede D. M., Curley, K., Cullen, N., & Batsell, W.
theory of impulsiveness and impulse control. Psy- R. (2014). Within-compound associations mediated
chological Bulletin, 82, 463–496. augmented flavor aversions. Learning and Motiva-
Akins, C. K. (2000). Effects of species-specific cues and tion, 48, 33–46.
the CS-US interval on the topography of the sexu- Amsel, A. (1958). The role of frustrative nonreward
ally conditioned response. Learning and Motivation, in noncontinuous reward situations. Psychological
31, 211–235. Bulletin, 55, 102–119.
Akins, C. K., Domjan, M., & Gutierrez, G. (1994). To- Amsel, A. (1962). Frustrative nonreward in partial
pography of sexually conditioned behavior in male reinforcement and discrimination learning: Some re-
Japanese quail (Coturnix japonica) depends on the cent history and a theoretical extension. Psychological
CS-US interval. Journal of Experimental Psychology: Review, 69, 306–328.
Animal Behavior Processes, 20, 199–209. Amsel, A. (1992). Frustration theory: An analysis of dis-
Alcock, J. (2013). Animal behavior: An evolutionary ap- positional learning and memory. Cambridge, England:
proach (10th ed.). Sunderland, MA: Sinauer Associates. Cambridge University Press.
Allan, L. G., & Gibbon, J. (1991). Human bisection Amsel, A., & Roussel, J. (1952). Motivational proper-
at the geometric mean. Learning and Motivation, 22, ties of frustration: I. Effect on a running response
39–58. of the addition of frustration to the motivational
482  References

complex. Journal of Experimental Psychology, 43, Atkinson, R. C., & Shiffrin, R. M. (1971). The control of
363–368. short-term memory. Scientific American, 225, 82–90.
Amsel, A., & Ward, J. S. (1965). Frustration and persis- Audrain-McGovern, J., Rodriguez, D., Epstein, L. H.,
tence: Resistance to discrimination following prior Cuevas, J., Roders, K., & Wileyto, E. P. (2009). Does
experience with the discriminanda. Psychological delay discounting play an etiological role in smok-
Monographs (General and Applied), 79, 41. ing or is it a consequence of smoking? Drug and
Anderson, J. R. (1995). Learning and memory: An inte- Alcohol Dependence, 103, 99–106.
grated approach. New York: John Wiley. Ayres, J. J. B. (1998). Fear conditioning and avoid-
Andrade, L. F., & Petry, N. M. (2014). Contingency ance. In W. O’Donohue (Ed.), Learning and behavior
management treatments for substance-use disorders therapy (pp. 122–145). Needham Heights, MA: Allyn
and healthy behaviors. In F. K. McSweeney & E. & Bacon.
S. Murphy (Eds.), The Wiley Blackwell handbook of Ayres, J. J., Haddad, C., & Albert, M. (1987). One-trial
operant and classical conditioning (pp. 627–644). West excitatory backward conditioning as assessed by
Sussex, UK: John Wiley & Sons. conditioned suppression of licking in rats: Concur-
Andresen, G. V., Birch, L. L., & Johnson, P. A. (1990). rent observations of lick suppression and defensive
The scapegoat effect on food aversions after chemo- behaviors. Animal Learning & Behavior, 15, 212–217.
therapy. Cancer, 66, 1649–1653. Azrin, N. H. (1960). Effects of punishment intensity
Anger, D. (1963). The role of temporal discrimination during variable-interval reinforcement. Journal of the
in the reinforcement of Sidman avoidance behavior. Experimental Analysis of Behavior, 3, 123–142.
Journal of the Experimental Analysis of Behavior, 6, B
477–506. Babb, S. J., & Crystal, J. D. (2006). Episodic-like
Anisman, H., DeCatanzaro, D., & Remington, G. memory in the rat. Current Biology, 16, 1317–1321.
(1978). Escape performance following exposure to Baerends, G. P. (1950). Specializations in organs and
inescapable shock: Deficits in motor response main- movements with a releasing function, Society for Ex-
tenance. Journal of Experimental Psychology: Animal perimental Biology. Physiological mechanisms in animal
Behavior Processes, 4, 197–218. behavior (Society’s Symposium IV) (pp. 337–360).
Annau, Z., & Kamin, L. J. (1961). The conditioned Baerends, G. P. (1976). The functional organization of
emotional response as a function of intensity of the behaviour. Animal Behaviour, 24, 726–738.
US. Journal of Comparative and Physiological Psychol- Baker, A. G. (1974). Conditioned inhibition is not the
ogy, 54, 428–432. symmetrical opposite of conditioned excitation:
Anokhin, K. V., Tiunova, A. A., & Rose, S. P. R. (2002). A test of the Rescorla-Wagner model. Learning and
Reminder effects—Reconsolidation or retrieval Motivation, 5, 369–379.
deficit? Pharmacological dissection with protein Baker, A. G. (1977). Conditioned inhibition aris-
synthesis inhibitors following reminder for a ing from a between-sessions negative correlation.
passive-avoidance task in young chicks. European Journal of Experimental Psychology: Animal Behavior
Journal of Neuroscience, 15, 1759–1765. Processes, 3, 144–155.
Arcediano, F., Matutue, H., Escobar, M., & Miller, Baker, A. G., & Mackintosh, N. J. (1977). Excitatory
R. R. (2005). Competition between antecedent and and inhibitory conditioning following uncorrelated
between subsequent stimuli in causal judgments. presentations of CS and UCS. Animal Learning &
Journal of Experimental Psychology: Learning, Memory, Behavior, 5, 315–319.
and Cognition, 31, 228–237. Baker, A. G., & Mercier, P. (1982). Extinction of the
Asen, Y., & Cook, R. G. (2012). Discrimination and context and latent inhibition. Learning and Motiva-
categorization of actions by pigeons. Psychological tion, 13, 391–416.
Science, 23, 617–624. Baker, A. G., Mercier, P., Vallee-Tourangeau, F., Frank,
Astley, S. L., & Wasserman, E. A. (1992). Categorical R., & Pan, M. (1993). Selective associations and
discrimination and generalization in pigeons: All causality judgments: Presence of a strong causal fac-
negative stimuli are not created equal. Journal of tor may reduce judgments of a weaker one. Journal
Experimental Psychology: Animal Behavior Processes, of Experimental Psychology: Learning, Memory, and
18, 193–207. Cognition, 19, 414–432.
Astley, S. L., & Wasserman, E. A. (1999). Superor- Baker, A. G., Murphy, R. A., & Vallee-Tourangeau, F.
dinate category formation in pigeons: Associa- (1996). Associative and normative models of causal
tion with a common delay or probability of food induction: Reacting to versus understanding cause.
reinforcement makes perceptually dissimilar stimuli In D. R. Shanks, K. Holyoak, & D. L. Medin (Ed.),
functionally equivalent. Journal of Experimental Psy- Causal Learning (pp. 1–45). San Diego: Academic
chology: Animal Behavior Processes, 25, 415–432. Press.
Astley, S. L., Peissig, J. J., & Wasserman, E. A. (2001). Baker, A. G., Murphy, R., Mehta, R., & Baetu, I. (2005).
Superordinate categorization via learned stimulus Mental models of causation: A comparative view. In
equivalence: Quantity of reinforcement, hedonic A. J. Wills (Ed.), New directions in human associative
value, and the nature of the mediator. Journal of learning. Mahwah, NJ: Erlbaum.
Experimental Psychology: Animal Behavior Processes, Baker, T. B., & Tiffany, S. T. (1985). Morphine tolerance
27, 252–268. as habituation. Psychological Review, 92, 78–108.
References  483

Baker, T. B., Piper, M. E., McCarthy, D. E., Majeskie, Barash, D. (1982). Sociobiology and behavior (2nd ed.).
M. R., & Fiore, M. C. (2004). Addiction motivation New York: Elsevier.
reformulated: An affective processing model of Bargh, J. A., & Chartrand, T. L. (1999). The unbear-
negative reinforcement. Psychological Review, 111, able automaticity of being. American Psychologist, 54,
33–51. 462–479.
Balaz, M. A., Kasprow, W. J., & Miller, R. R. (1982). Barlow, D. H. (1988). Anxiety and its disorders: The
Blocking with a single compound trial. Animal nature and treatment of anxiety and panic. New York:
Learning & Behavior, 10, 271–276. Guilford Press.
Balleine, B. W. (1992). Instrumental performance Barlow, D. H. (2002). Anxiety and its disorders: The na-
following a shift in primary motivation depends on ture and treatment of anxiety and panic (2nd ed.). New
incentive learning. Journal of Experimental Psychology: York: Guilford Press.
Animal Behavior Processes, 18, 236–250. Barnes, J. M., & Underwood, B. J. (1959). “Fate” of
Balleine, B. W. (1994). Asymmetrical interactions be- first-list associations in transfer theory. Journal of
tween thirst and hunger in Pavlovian-instrumental Experimental Psychology, 58, 97–105.
transfer. The Quarterly Journal of Experimental Psychol- Bartlett, F. C. (1932). Remembering: A study in experi-
ogy, 47B, 211–231. mental and social psychology. Cambridge: Cambridge
Balleine, B. W. (2001). Incentive processes in instru- University Press.
mental conditioning. In R. R. Mowrer, & S. B. Klein Basile, B. M., Schroeder, G. R., Brown, E. K., Templer,
(Ed.), Handbook of contemporary learning theories (pp. V. L., & Hampton, R. R. (2015). Evaluation of seven
307–366). Hillsdale, NJ: Erlbaum. hypotheses for metamemory performance in rhesus
Balleine, B. W. (2005). Neural bases of food-seeking: monkeys. Journal of Experimental Psychology: General,
Affect, arousal and reward in corticostriatolimbic 144, 85–102.
circuits. Physiology and Behavior, 86, 717–730. Basoglu, M., Marks, I. M., & Sengun, S. (1992). A pro-
Balleine, B. W., & Dickinson A. (1991). Instrumen- spective study of panic and anxiety in agoraphobia
tal performance following reinforcer devaluation with panic disorder. British Journal of Psychiatry, 160,
depends upon incentive learning. Quarterly Journal 57–64.
of Experimental Psychology, 43B, 279–296. Batsell, W. R., & Blankenship, A. G. (2002). Beyond
Balleine, B. W., & Dickinson, A. (1994). Role of cho- potentiation: Synergistic conditioning in flavor-aver-
lecystokinin in the motivational control of instru- sion learning. Brain and Mind, 3, 383–408.
mental action in rats. Behavioral Neuroscience, 108, Batsell, W. R., Jr., & Batson, J. D. (1999). Augmenta-
590–605. tion of taste conditioning by a preconditioned odor.
Balleine, B. W., & Dickinson A. (1998). Consciousness: Journal of Experimental Psychology: Animal Behavior
The interface between affect and cognition. In J. Processes, 25, 374–388.
Cornwell (Ed.), Consciousness and human identity (pp. Batsell, W. R., Paschall, G. Y., Gleason, D. I., & Batson,
57–85). Oxford: Oxford University Press. J. D. (2001). Taste preconditioning augments odor-
Balleine, B. W., & O’Doherty, J. P. (2010). Human and aversion learning. Journal of Experimental Psychology:
rodent homologies in action control: Corticostriatal Animal Behavior Processes, 28, 30–47.
determinants of goal-directed and habitual action. Batsell, W. R., Trost, C. A., Cochran, S. R., Blanken-
Neuropsychopharmacology, 35, 48–69. ship, A. G., & Batson, J. D. (2003). Effects of post-
Balleine, B. W., Garner, C., & Dickinson, A. (1995). conditioning inflation on odor + taste compound
Instrumental outcome devaluation is attenuated conditioning. Learning & Behavior, 31, 173–184.
by the anti-emetic ondansetron. Quarterly Journal of Batsell, W. R., Wakefield, E., Ulrey, L. A., Reimink, K.,
Experimental Psychology, 48, 235–251. Rowe, S. L., & Dexheimer, S. (2012). CS-US interval
Balleine, B. W., Garner, C., Gonzalez, F., & Dickinson, determines the transition from overshadowing to
A. (1995). Motivational control of heterogeneous in- potentiation with flavor compounds. Learning &
strumental chains. Journal of Experimental Psychology: Behavior, 40, 180–194.
Animal Behavior Processes, 21, 203–217. Batson, J. D., & Batsell, W. R. (2000). Augmentation,
Balsam, P. (1984). Relative time in trace condition- not blocking, in an A+/AX+ flavor-conditionaing
ing. Annals of the New York Academy of Sciences, 423, procedure. Psychonomic Bulletin & Review, 7, 466–471.
211–227. Baum, W. M. (1973). The correlation-based law of
Balsam, P. D., & Payne, D. (1979). Intertrial interval effect. Journal of the Experimental Analysis of Behavior,
and unconditioned stimulus durations in autoshap- 20, 137–153.
ing. Animal Learning & Behavior, 7, 477–482. Baum, W. M. (1974). On two types of deviation from
Balsam, P. D., & Tomie, A. (Eds.). (1985). Context and the matching law: Bias and undermatching. Journal
learning. Hillsdale, NJ: Lawrence Erlbaum Associ- of the Experimental Analysis of Behavior, 22, 231–242.
ates, Inc. Baum, W. M. (2001). Molar versus molecular as a
Balsam, P. D., Deich, J. D., Ohyama, T., & Stokes, P. D. paradigm clash. Journal of the Experimental Analysis of
(1998). Origins of new behavior. In W. T. O’Donohue Behavior, 75, 338–341.
(Ed.), Learning and behavior therapy (pp. 403–420). Beatty, W. W., & Shavalia, D. A. (1980). Rat spatial
Boston, MA: Allyn and Bacon. memory: Resistance to retroactive interference at
484  References

Review, 97, 78–89. Learning & Behavior, 20, 350–362.
Domjan, M. (1983). Biological constraints on instru- Domjan, M., Lyons, R., North, N. C., & Bruell, J.
mental and classical conditioning: Implications for (1986). Sexual Pavlovian conditioned approach
general process theory. In G. H. Bower (Ed.), The behavior in male Japanese quail (Coturnix coturnix
References  495

japonica). Journal of Comparative Psychology, 100, of the conditioned emotional response. Psychonomic
413–421. Science, 18, 145–147.
Domjan, M., Mahometa, M. J., & Matthews, R. N. Dworkin, B. R. (1993). Learning and physiological regula-
(2012). Learning in intimate connections: Condi- tion. Chicago, IL: University of Chicago Press.
tioned fertility and its role in sexual competition. Dwyer, D. M., Bennett, C. H., & Mackintosh, N. J.
Socioaffective Neuroscience & Psychology, 2, 17333. (2001). Evidence for inhibitory associations between
Donahoe, J. W. (1998). Positive reinforcement: The the unique elements of two compound flavours.
selection of behavior. In W. T. O’Donohue (Ed.), Quarterly Journal of Experimental Psychology, 54B,
Learning and behavior therapy (pp. 169–187). Boston, 97–108.
MA: Allyn & Bacon. Dwyer, D. M., Mackintosh, N. J., & Boakes, R. A.
Donahoe, J. W., & Burgos, J. E. (2000). Behavior (1998). Simultaneous activation of the representa-
analysis and revaluation. Journal of the Experimental tions of absent cues results in the formation of an
Analysis of Behavior, 74, 331–346. excitatory association between them. Journal of
Donahoe, J. W., & Palmer, D. C. (1994). Learning and Experimental Psychology: Animal Behavior Processes,
complex behavior. Boston, MA: Allyn and Bacon. 24, 163–171.
Donahoe, J. W., & Vegas, R. (2004). Pavlovian condi- E
tioning: The CS-UR relation. Journal of Experimental Edhouse, W. V., & White, K. G. (1988). Sources of
Psychology: Animal Behavior Processes, 30, 17–33. proactive interference in animal memory. Journal of
Donahoe, J. W., Burgos, J. E., & Palmer, D. C. (1993). A Experimental Psychology: Animal Behavior Processes,
selectionist approach to reinforcement. Journal of the 14, 56–70.
Experimental Analysis of Behavior, 60, 17–40. Edmunds, M. (1974). Defense in animals. Harlow, Es-
Dopson, J. C., Esber, G. R., & Pearce, J. M. (2010). Dif- sex: Longman.
ferences between the associability of relevant and Ehlers, A., Hackmann, A., & Michael, T. (2004).
irrelevant stimuli. Journal of Experimental Psychology: Intrusive re-experiencing in post-traumatic stress
Animal Behavior Processes, 36, 258–267. disorder: Phenomenology, theory, and therapy.
Dopson, J. C., Pearce, J. M., & Haselgrove, M. (2010). Memory, 12, 403–415.
Failure of retrospective revaluation to influence Eibl-Eibesfeldt, I. (1970). Ethology: The biology of behav-
blocking. Journal of Experimental Psychology: Animal ior. New York: Holt Rinehard and Winston.
Behavior Processes, 35, 473–484. Eibl-Eibesfeldt, I. (1979). Human ethology: Concepts
Doyle, T. A., & Samson, H. H. (1988). Adjunctive and implications for the sciences of man. Behavioral
alcohol drinking in humans. Physiology and Behavior, and Brain Sciences, 2, 1–57.
27, 419–431. Eibl-Ebesfeldt, I. (1989). Human ethology. New York:
Duda, J. J., & Bolles, R. C. (1963). Effects of prior Aldine de Gruyter.
deprivation, current deprivation, and weight loss on Eikelboom, R., & Stewart, J. (1982). Conditioning of
the activity of the hungry rat. Journal of Comparative drug-induced physiological responses. Psychological
and Physiological Psychology, 56, 569–571. Review, 89, 507–528.
Dukas, R., & Kamil, A. C. (2001). Limited attention: Eisenberg, M., Kobilo, T., Berman, T. E., & Dudai, Y.
The constraint underlying search image. Behavioral (2003). Stability of retrieved memory: inverse corre-
Ecology, 12, 192–199. lation with trace dominance. Science, 301,1102–1104.
Dunham, P. (1977). The nature of the reinforcing Eisenberger, R. (1992). Learned industriousness. Psy-
stimulus. In W. K. Honig & J. E. Staddon (Eds.), chological Review, 99, 248–267.
Handbook of operant behavior (pp. 98–124). Englewood Eisenberger, R., & Cameron, J. (1996). Detrimental ef-
Cliffs, NJ: Prentice-Hall. fects of reward: Reality or myth? American Psycholo-
Durlach, P. J. (1989). Learning and performance in gist, 51, 1153–1166.
Pavlovian conditioning: Are failures of contiguity Eisenberger, R., Karpman, M., & Trattner, J. (1967).
failures of learning or performance? In S. B. Klein & What is the necessary and sufficient condition for
R. R. Mowrer (Eds.), Contemporary learning theories: reinforcement in the contingency situation? Journal
Pavlovian conditioning and the status of traditional of Experimental Psychology, 74, 342–350.
learning theory (pp. 19–59). Hillsdale, NJ: Lawrence Eisenhardt, D. (2014). Molecular mechanisms under-
Erlbaum Associates, Inc. lying formation of long-term reward memories and
Durlach, P. J., & Rescorla, R. A. (1980). Potentiation extinction memories in the honeybee (Apis mellifera).
rather than overshadowing in flavor-aversion learn- Learning & Memory, 21, 534–542.
ing: An analysis in terms of within-compound as- Eiserer, L. A., & Hoffman, H. S. (1973). Priming of
sociations. Journal of Experimental Psychology: Animal ducklings’ responses by presenting an imprinted
Behavior Processes, 6, 175–187. stimulus. Journal of Comparative and Physiological
Duvarci, S., & Nader, K. (2004). Characterization of Psychology, 82, 345–359.
fear memory reconsolidation. Journal of Neuroscience, Ellen, P., Soteres, B. J., & Wages, C. (1984). Problem
24, 9269–9275. solving in the rat: Piecemeal acquisition of cognitive
Dweck, C. S., & Wagner, A. R. (1970). Situational cues maps. Animal Learning & Behavior, 12, 232–237.
and correlation between CS and US as determinants
496  References

Elliott, M. H. (1928). The effect of change of reward on Estes, W. K. (1950). Toward a statistical theory of
the maze performance of rats. University of California learning. Psychological Review, 57, 94–107.
Publications in Psychology, 4, 19–30. Estes, W. K. (1955). Statistical theory of distributional
Ellison, G. D., & Konorski, J. (1964). Separation of the phenomena in learning. Psychological Review, 62,
salivary and motor responses in instrumental condi- 369–377.
tioning. Science, 146, 1071–1073. Estes, W. K. (1959). Component and pattern models
Epstein, L. H., Dearing, K. K., Roba, L. G., & Finkel- with Markovian interpretation. In R. R. Bush & W.
stein, E. (2010). The influence of taxes and subsidies K. Estes (Eds.), Studies in mathematical learning theory
on energy purchased in an experimental purchasing (pp. 239–263). Stanford, CA: Stanford University
study. Psychological Science, 21, 406–414. Press.
Epstein, L. H., Jankowiak, N., Nederkoorn, C., Estes, W. K., & Skinner, B. F. (1941). Some quantitative
Raynor, H. A., French, S. A., & Finkelstein, E. (2012). properties of anxiety. Journal of Experimental Psychol-
Experimental research on the relation between food ogy, 29, 390–400.
price changes and food-purchasing patterns: A tar- Estle, S. J., Green, L., Myerson, J., & Holt, D. D. (2007).
geted review. American Journal of Clinical Nutrition, Discounting of monetary and directly consumable
95, 789–809. rewards. Psychological Science, 18, 58–63.
Epstein, L. H., Paluch, R., & Coleman, K. J. (1996). Etienne, A. S. (1992). Navigation of a small mammal
Differences in salivation to repeated food cues in by dead reckoning and local cues. Current Directions
obese and nonobese women. Psychosomatic Medicine, in Psychological Science, 1, 48–52.
58, 160–164. Etienne, A. S., Berlie, J., Georgakopoulos, J., & Maurer,
Epstein, L. H., Robinson, J. L., Roemmich, J. N., & R. (1998). Role of dead reckoning in navigation. In
Marusewski, A. (2011). Slow rates of habituation S. Healy (Ed.), Spatial representation in animals (pp.
predict greater zBMI gains over 12 months in lean 54–68). New York: Oxford University Press.
children. Eating Behaviors, 12, 214–218. Everitt, B. J., & Robbins, T. W. (2005). Neural systems
Epstein, L. H., Robinson, J. L., Temple, J. L., Roem- of reinforcement for drug addiction: From actions
mich, J. N., Marusewski, A., & Nadbrzuch, R. (2008). to habits to compulsion. Nature Neuroscience, 8,
Sensitization and habituation of motivated behavior 1481–1489.
in overweight and non-overweight children. Learn- Eysenck, H. J. (1979). The conditioning model of neu-
ing and Motivation, 39, 243–255. rosis. Behavioral and Brain Sciences, 2, 155–199.
Epstein, L. H., Rodefer, J. S., Wisniewski, L., & Cag- F
giula, A. R. (1992). Habituation and dishabituation Falk, J. L. (1961). Production of polydipsia in normal
of human salivary response. Physiology & Behavior, rats by an intermittent food schedule. Science, 133,
51, 945–950. 195.
Epstein, L. H., Saad, F. G., Handley, E. A., Roemmich, Falk, J. L. (1977). The origin and functions of adjunc-
J. N., Hawk, L. W., & McSweeney, F. K. (2003). Ha- tive behavior. Animal Learning and Behavior, 5,
bituation of salivation and motivated responding for 325–335.
food in children. Appetite, 41, 283–289. Falk, J. L., & Kupfer, A. S. (1998). Adjunctive behavior:
Epstein, L. H., Temple, J. L., Roemmich, J. N., & Application to the analysis and treatment of behav-
Bouton, M. E. (2009). Habituation as a determinqant ior problems. In W. O’Donohue (Ed.), Learning and
of human food intake. Psychological Review, 116, behavior therapy (pp. 334–351). Needham Heights,
384–407. MA: Allyn & Bacon.
Epstein, S. (1967). Toward a unified theory of anxiety. Falk, J. L., Samson, H. H., & Winger, G. (1972). Behav-
In B. Maher (Ed.), Progress in experimental personality ioral maintenance of high concentrations of blood
research (pp. 1–89). New York: Academic Press. ethanol and physical dependence in the rat. Science,
Ernst, A. J., Engberg, L., & Thomas, D. R. (1971). On 177, 811–813.
the form of stimulus generalization curves for visual Fanselow, M. S. (1979). Naloxone attenuates rat’s
intensity. Journal of the Experimental Analysis of Behav- preference for signaled shock. Physiological Psychol-
ior, 16, 177–180. ogy, 7, 70–74.
Esber, G. R., & Haselgrove, M. (2011). Reconciling Fanselow, M. S. (1980). Conditional and unconditional
the influence of predictiveness and uncertainty on components of post-shock freezing. Pavlovian Journal
stimulus salience: A model of attention in associa- of Biological Sciences, 15, 177–182.
tive learning. Proceedings of the Royal Society, 278, Fanselow, M. S. (1982). The post-shock activity burst.
2553–2561. Animal Learning and Behavior, 190, 448–454.
Esber, G. R., McGregor, A., Good, M. A., Hayward, A., Fanselow, M. S. (1985). Odors released by stressed
& Pearce, J. M. (2005). Transfer of spatial behaviour rats produce opioid analgesia in unstressed rats.
controlled by a landmark array with a distinctive Behavioral Neuroscience, 99, 589–592.
shape. Quarterly Journal of Experimental Psychology, Fanselow, M. S. (1989). The adaptive function of
58, 69–91. conditioned defensive behavior: An ecological ap-
Espinet, A., Artigas, A. A., & Balleine, B. W. (2008). proach to Pavlovian stimulus-substitution theory.
Inhibitory sensory preconditioning detected with In R. J. Blanchard, P. F. Brain., D. C. Blanchard, & S.
a sodium depletion procedure. Quarterly Journal of Parmigiani (Eds.), Ethoexperimental approaches to the
Experimental Psychology, 61, 240–247.
References  497

study of behavior (pp. 151–166). New York: Kluwer Flagel, S. B., Robinson, T. E., Clark, J. J., Clinton, S. M.,
Academic/Plenum Publishers. Watson, S. J., Seeman, P., Phillips, P. E. M., & Akil,
Fanselow, M. S. (1990). Factors governing one-trial H. (2010). An animal model of genetic vulnerability
contextual conditioning. Animal Learning & Behavior, to behavioral disinhibition and responsiveness to
18, 264–270. reward-related cues: Implications for addiction.
Fanselow, M. S. (1994). Neural organization of the Neuropsychopharmacology, 35, 388–400.
defensive behavior system responsible for fear. Flaherty, C. F. (1985). Animal learning and cognition.
Psychonomic Bulletin and Review, 1, 429–438. New York: Alfred A. Knopf.
Fanselow, M. S., & Baackes, M. P. (1982). Conditioned Flaherty, C. F. (1991). Incentive contrast and selected
fear-induced opiate analgesia on the formalin test: animal models of anxiety. In L. Dachowski & C.
Evidence for two aversive motivational systems. F. Flaherty (Eds.), Current topics in animal learning:
Learning and Motivation, 13, 200–221. Brain, emotion and cognition (pp. 207–243). Hillsdale,
Fanselow, M. S., & Birk, J. (1982). Flavor-flavor as- NJ: Lawrence Erlbaum Associates, Inc.
sociations induce hedonic shifts in taste preference. Flaherty, C. F. (1996). Incentive relativity. New York:
Animal Learning & Behavior, 10, 223–228. Cambridge University Press.
Fanselow, M. S., & Lester, L. S. (1988). A functional Flaherty, C. F., Becker, H. C., & Checke, S. (1983).
behavioristic approach to aversively motivated Repeated successive contrast in consummatory be-
behavior: Predatory imminence as a determinant of havior with repeated shifts in sucrose concentration.
the topography of defensive behavior. In R. C. Bolles Animal Learning & Behavior, 11, 407–414.
& M. D. Beecher (Eds.), Evolution and learning (pp. Flaherty, C. F., Becker, H. C., & Pohorecky, L. (1985).
185–212). Hillsdale, NJ: Lawrence Erlbaum Associ- Correlation of corticosterone elevation and negative
ates, Inc. contrast varies as a function of postshift day. Animal
Fanselow, M. S., & Poulos, A. M. (2005). The neurosci- Learning & Behavior, 13, 309–314.
ence of mammalian associative learning. Annual Flaherty, C. F., Lombardi, B. R., Wrightson, J., &
Review of Psychology, 56, 207–234. Deptula, D. (1980). Conditions under which chlordi-
Fantino, E. (1969). Choice and rate of reinforcement. azepoxide influences gustatory contrast. Psychophar-
Journal of the Experimental Analysis of Behavior, 12, macologia, 67, 269–277.
723–730. Foltin, R. W. (1991). An economic analysis of “de-
Fantino, E. (1977). Conditioned reinforcement: Choice mand” for food in baboons. Journal of the Experimen-
and information. In W. K. Honig & J. E. Staddon tal Analysis of Behavior, 56, 445–454.
(Eds.), Handbook of operant behavior (pp. 313–339). Forestell, C. A., & LoLordo, V. M. (2003). Palatability
Englewood Cliffs, NJ: Prentice Hall. shifts in taste and flavor preference conditioning.
Fedorchak, P. M., & Bolles, R. C. (1987). Hunger Quarterly Journal of Experimental Psychology B, 1,
enhances the expression of calorie- but not taste- 140–160.
mediated conditioned flavor preferences. Journal of Freeman, J. H., & Steinmetz, A. B. (2011). Neural
Experimental Psychology: Animal Behavior Processes, circuitry and plasticity mechanisms underlying
13, 73–79. delay eyeblink conditioning. Learning & Memory, 18,
Fedorchak, P. M., & Bolles, R. C. (1988). Nutritive 666–677.
expectancies mediate cholecystokinin’s suppression- Friedman, B. X., Blaisdell, A. P., Escobar, M., & Miller,
of-intake effect. Behavioral Neuroscience, 102, 451–455. R. R. (1998). Comparator mechanisms and condi-
Fernando, A. B. P., Urcelay, G. P., Mar, A. C., Dickin- tioned inhibition: Conditioned stimulus preexposure
son, A., & Robbins, T. W. (2014). Safety signals as disrupts Pavlovian conditioned inhibition but not
instrumental reinforcers during free-operant avoid- explicitly unpaired inhibition. Journal of Experimental
ance. Learning & Memory, 21, 488–497. Psychology: Animal Behavior Processes, 24, 453–466.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of Funayama, E. S., Couvillon, P. A., & Bitterman, M.
reinforcement. East Norwalk, CT: Appleton- E. (1995). Compound conditioning in honeybees:
Century-Crofts. Blocking tests of the independence assumption.
Fetterman, J. G. (1996). Dimensions of stimulus com- Animal Learning & Behavior, 23, 429–437.
plexity. Journal of Experimental Psychology: Animal G
Behavior Processes, 22, 3–18. Gale, G. D., Anagnostaras, S. G., Godsil, B. P.,
Findley, J. D. (1958). Preference and switching under Mitchell, S., Nozawa, T., Sage, J. R., Wiltgen, B., &
concurrent scheduling. Journal of the Experimental Fanselow, M. S. (2004). Role of the basolateral amyg-
Analysis of Behavior, 1, 123–144. dala in the storage of fear memories across the adult
Fitzsimons, T. J., & Le Magnen, J. (1969). Eating as a lifetime of rats. Journal of Neuroscience, 24, 3810–3815.
regulatory control of drinking in the rat. Journal of Galef, B. G. (1991). A contrarian view of the wisdom
Comparative and Physiological Psychology, 67, 273–283. of the body as it relates to dietary self-selection.
Flagel, S. B., Akil, H., & Robinson, T. E. (2009). Indi- Psychological Review, 98, 218–223.
vidual differences in the attribution of incentive Gallistel, C. R. (1990). The organization of learning.
salience to reward-related cues: Implications for Cambridge, MA: MIT Press.
addiction. Neuropharmacology, 56, 139–148.
498  References

Gallistel, C. R. (1994). Space and time. In N. J. response maintenance and resistance to extinction.
Mackintosh (Ed.), Animal learning and cognition (pp. Animal Learning & Behavior, 6, 209–215.
221–253). San Diego: Academic Press. Gibson, B. M., & Shettleworth, S. J. (2003). Competi-
Gallistel, C. R., & Gibbon, J. (2000). Time, rate, and tion among spatial cues in a naturalistic food-carry-
conditioning. Psychological Review, 107, 289–344. ing task. Learning & Behavior, 31, 143–159.
Garb, J. L., & Stunkard, A. J. (1974). Taste aversions in Gibson, B. M., & Shettleworth, S. J. (2005). Place
man. American Journal of Psychiatry, 131, 1204–1207. versus response learning revisited: Tests of blocking
Garcia, J. (1989). Food for Tolman: Cognition and on the radial maze. Behavioral Neuroscience, 119,
cathexis in concert. In T. Archer & L.-G. Nilsson 567–586.
(Eds.), Aversion, avoidance, and anxiety: Perspectives Gibson, E. J., & Walk, R. D. (1956). The effect of pro-
on aversively motivated behavior (pp. 45–85). Hillsdale, longed exposure to visually presented patterns on
NJ: Lawrence Erlbaum Associates, Inc. learning to discriminate them. Journal of Comparative
Garcia, J., & Koelling, R. A. (1966). Relation of cue to and Physiological Psychology, 49, 239–242.
consequence in avoidance learning. Psychonomic Gibson, E. J., & Wilson, R. D. (1956). The effect of pro-
Science, 4, 123–124. longed exposure to visually presented patterns on
Garcia, J., Ervin, F. R., & Koelling, R. A. (1966). Learn- learning to discriminate them. Journal of Comparative
ing with prolonged delay of reinforcement. Psycho- and Physiological Psychology, 49, 239–242.
nomic Science, 5, 121–122. Gibson, J. J., & Gibson, E. J. (1955). Perceptual learn-
Garcia, J., Hankins, W. G., & Rusiniak, K. W. (1974). ing: Differentiation or enrichment? Psychological
Behavioral regulation of the milieu interne in man Review, 62, 32–41.
and rat. Science, 185, 824–831. Giesen, J. C. A. H., Havermans, R. C., Nederkoorn, C.,
Garcia, J., Kimeldorf, D. J., & Koelling, R. A. (1955). & Jansen, A. (2012). Impulsivity in the supermarket:
Conditioned aversion to saccharin resulting from ex- Responses to calorie taxes and subsidies in healthy
posure to gamma radiation. Science, 122, 157–158. weight undergraduates. Appetite, 58, 6–10.
Gemberling, G. A., & Domjan, M. (1982). Selective Gisquet-Verrier, P., Lynch III, J. F., Cutolo, P., Tole-
associations in one-day-old rats: Taste-toxicosis and dano, D., Ulmen, A., Jasnow, A. M., & Riccio, D. C.
texture-shock aversion learning. Journal of Compara- (2015). Integration of new information with active
tive and Physiological Psychology, 96, 105–113. memory accounts for retrograde amnesia: A chal-
George, D. N., & Pearce, J. M. (1999). Acquired lenge to the consolidation/reconsolidation hypoth-
distinctiveness is controlled by stimulus relevance esis? Journal of Neuroscience, 35, 11623–11633.
not correlation with reward. Journal of Experimental Giurfa, M. (2008). Behavioral and neural analysis of
Psychology: Animal Behavior Processes, 25, 363–373. associate learning in the honey bee. In In J. H. Byrne,
George, D. N., & Pearce, J. M. (2003). Visual search D. Sweatt, R. Menzel, H. Eichenbaum, & H. Roedi-
asymmetry in pigeons. Journal of Experimental Psy- ger (Eds.), Learning and memory: A comprehensive
chology: Animal Behavior Processes, 29, 118–129. reference (Vol. 1, Learning Theory and Behaviour, pp.
George, D. N., & Pearce, J. M. (2012). A configural 561–585). Oxford: Elsevier.
theory of attention and associative learning. Learning Giurfa, M., & Sandoz, J-C. (2012). Invertebrate learn-
& Behavior, 40, 241–254. ing and memory: Fifty years of olfactory condition-
Gershman, S. J., Blei, D. M., & Niv, Y. (2010). Context, ing of the proboscis extension response in honey-
learning, and extinction. Psychological Review, 117, bees. Learning & Memory, 19, 54–66.
197–209. Giurfa, M., Zhang, S., Jennett, A., Menzel, R., & Srini-
Gibbon, J. (1991). Origins of scalar timing. Learning vasan, M. V. (2001). The concepts of ‘sameness’ and
and Motivation, 22, 3–38. ‘difference’ in an insect. Nature, 410, 930–933.
Gibbon, J., & Balsam, P. (1981). Spreading association Glazer, H. I., & Weiss, J. M. (1976). Long-term interfer-
in time. In C. Locurto, H. S. Terrace, & J. Gibbon ence effect: An alternative to “learned helplessness.”
(Eds.), Autoshaping and conditioning theory (pp. Journal of Experimental Psychology: Animal Behavior
219–235). New York: Academic Press. Processes, 2, 202–213.
Gibbon, J., & Church, R. M. (1984). Sources of variance Gleitman, H. (1971). Forgetting of long-term memo-
in an information processing theory of timing. In H. ries in animals. In W. K. Honig & H. James (Eds.),
L. Roitblat, T. G. Bever, & H. S. Terrace (Eds.), Ani- Animal Memory. New York, NY: Academic Press.
mal Cognition (pp. 465–488). Hillsdale, NJ: Erlbaum. Gluck, M. A., & Bower, G. H. (1988). From condition-
Gibbon, J., Baldock, M. D., Locurto, C., Gold, L., & ing to category learning: An adaptive network
Terrace, H. S. (1977). Trial and intertrial durations model. Journal of Experimental Psychology: General,
in autoshaping. Journal of Experimental Psychology: 117, 227–247.
Animal Behavior Processes, 3, 264–284. Gluck, M. A., & Myers, C. E. (1993). Hippocampal me-
Gibbon, J., Church, R. M., & Meck, W. H. (1984). diation of stimulus representation: A computational
Scalar timing in memory. Annals of the New York theory. Hippocampus, 3, 491–516.
Academy of Sciences, 423, 52–77. Goddard, M. J. (1999). The role of US signal value in
Gibbs, C. M., Latham, S. B., & Gormezano, I. (1978). contingency, drug conditioning, and learned help-
Classical conditioning of the rabbit nictitating mem- lessness. Psychonomic Bulletin & Review, 6, 412–423.
brane response: Effects of reinforcement schedule on
References  499

Godden, D. R., & Baddeley, A. D. (1975). Context- Gould, S. J., & Vrba, E. S. (1982). Exaptation: A missing
dependent memory in two natural environments: term in the science of form. Paleobiology, 8, 4–15.
On land and underwater. British Journal of Psychol- Grace, R. C. (1994). A contextual model of concurrent-
ogy, 66, 325–331. chains choice. Journal of the Experimental Analysis of
Godden, D., & Baddeley, A. (1980). When does context Behavior, 61, 113–129.
influence recognition memory? British Journal of Graham, M., Good, M. A., McGregor, A., & Pearce,
Psychology, 71, 99–104. J. M. (2006). Spatial learning based on the shape of
Gonzalez, F., Quinn, J. J., & Fanselow, M. S. (2003). the environment is influenced by properties of the
Differential effects of adding and removing compo- objects forming the shape. Journal of Experimental
nents of a context on the generalization of condition- Psychology: Animal Behavior Processes, 32, 44–59.
al freezing. Journal of Experimental Psychology: Animal Grand, C., & Honey, R. C. (2008). Solving XOR. Journal
Behavior Processes, 29, 78–83. of Experimental Psychology: Animal Behavior Processes,
Gonzalez, R. C., Gentry, G. V., & Bitterman, M. E. 34, 486–493.
(1954). Relational discrimination of intermediate Grant, D. S. (1975). Proactive interference in pigeon
size in the chimpanzee. Journal of Comparative and short-term memory. Journal of Experimental Psychol-
Physiological Psychology, 47, 385–388. ogy: Animal Behavior Processes, 1, 207–220.
Goodyear, A. J., & Kamil, A. C. (2004). Clark’s nut- Grant, D. S. (1976). Effect of sampling presentation
crackers (Nucifraga columbiana) and the effects of time in long-delay matching in the pigeon. Learning
goal-landmark distance on overshadowing. Journal and Motivation, 7, 580–590.
of Comparative Psychology, 118, 258–264. Grant, D. S., & Roberts, W. A. (1973). Trace interaction
Gordon, W. C., & Weaver, M. S. (1989). Cue-induced in pigeon short-term memory. Journal of Experimental
transfer of CS preexposure effects across contexts. Psychology, 101, 21–29.
Animal Learning & Behavior, 17, 409–417. Grant, P. R., & Grant, B. R. (2002). Unpredictable
Gordon, W. C., McCracken, K. M., Dess-Beech, N., & evolution in a 30-year study of Darwin’s finches.
Mowrer, R. R. (1981). Mechanisms for the Cueing Science, 296, 707–711
Phenomenon: The Addition of the Cueing Context Green, L., & Freed, D. E. (1998). Behavioral econom-
to the Training Memory. Learning and Motivation, 12, ics. In W. T. O’Donohue (Ed.), Learning and behavior
196–211. therapy (pp. 274–300). Needham Heights, MA: Allyn
Gordon, W. C., Smith, G. J., & Katz, D. S. (1979). Dual and Bacon.
effects of response blocking following avoidance Green, L., & Myerson, J. (2013). How many impul-
learning. Behaviour Research and Therapy, 17, 479–487. sivities? A discounting perspective. Journal of the
Gordon, W. C., Taylor, J. R., & Mowrer, R. R. (1981). Experimental Analysis of Behavior, 99, 3–13.
Enhancement of short-term retention in rats with Green, L., Fisher, E. B., Perlow, S., & Sherman, L.
pretest cues: Effects of the training-cueing interval (1981). Preference reversal and self control: Choice
and the specific cueing treatment. American Journal as a function of reward amount and delay. Behaviour
of Psychology, 94, 309–322. Analysis Letters, 1, 43–51.
Gormezano, I., Kehoe, E. J., & Marshall, B. S. (1983). Grice, G. R. (1948). The relation of secondary rein-
Twenty years of classical conditioning research with forcement to delayed reward in visual discrimina-
the rabbit. Progress in Psychobiology and Physiological tion learning. Journal of Experimental Psychology, 38,
Psychology, 10, 197–275. 1–16.
Gormezano, I., Prokasy, W. F., & Thompson, R. F. Grill, H. J., & Norgren, R. (1978). The taste reactiv-
(Eds.). (1987). Classical conditioning (3rd ed.). Hills- ity test: I. Mimetic responses to gustatory stimuli
dale, NJ: Lawrence Erlbaum Associates, Inc. in neurologically normal rats. Brain Research, 143,
Gottlieb, G. (1965). Imprinting in relation to paren- 263–279.
tal and species identification by avian neonates. Grosch, J., & Neuringer, A. (1981). Self-control in
Journal of Comparative and Physiological Psychology, 59, pigeons under the Mischel paradigm. Journal of the
345–356. Experimental Analysis of Behavior, 35, 3–21.
Gould, J. L. (1984). Natural history of honeybee learn- Grossen, N. E., & Kelley, M. J. (1972). Species-specific
ing. In P. Marler & H. S. Terrace (Eds.), The biology of behavior and acquisition of avoidance in rats.
learning (pp. 150–180). Berlin: Springer. Journal of Comparative and Physiological Psychology, 81,
Gould, J. L. (1996). Specializations in honeybee learn- 307–310.
ing. In C. F. Moss & S. J. Shettleworth (Eds.), Neu- Groves, P. M., & Thompson, R. F. (1970). Habitua-
roethological studies of cognitive and perceptual processes tion: A dual-process theory. Psychological Review, 77,
(pp. 11–30). Princeton, NJ: Princeton University 419–450.
Press. Guthrie, E. R. (1935). The psychology of learning. New
Gould, J. L., & Marler, P. (1987). Learning by instinct. York: Harper.
Scientific American, 256, 74–85 Guthrie, E. R., & Horton, G. P. (1946). Cats in a puzzle
Gould, S. J. (1991). Exaptation: A crucial tool for an box. New York: Rinehart.
evolutionary psychology. Journal of Social Issues, 47, Guttman, N., & Kalish, H. I. (1956). Discriminability
43–65. and stimulus generalization. Journal of Experimental
Psychology, 51, 79–88.
500  References

H Hamlin, A. S., Newby, J., & McNally, G. P. (2007). The


Hagenaars, M. A., Oitzl, M., & Roelofs, K. (2014). Up- neural correlates and role of D1 dopamine receptors
dating freeze: Aligning animal and human research. in renewal of extinguished alcohol-seeking. Neuro-
Neuroscience and Biobehavioral Reviews, 47, 165–176. science, 146, 525–536.
Hailman, J. P. (1967). The ontogeny of an instinct. Hamm, J., Matheson, W. R., & Honig, W. K. (1997).
Behaviour Supplement, 15, 1–159. Mental rotation in pigeons (Columbia livia). Journal of
Hall, G. (1991). Perceptual and associative learning. Comparative Psychology, 111, 76–81.
Oxford: Clarendon Press. Hammond, L. J. (1980). The effect of contingency
Hall, G. (1996). Learning about associatively activated upon the appetitive conditioning of free-operant be-
stimulus representations: Implications for acquired havior. Journal of the Experimental Analysis of Behavior,
equivalence and perceptual learning. Animal Learn- 34, 297–304.
ing and Behavior, 24, 233–255. Hampton, R. R. (2001). Rhesus monkeys know when
Hall, G., & Channell, S. (1985). Differential effects they remember. Proceedings of the National Academy of
of contextual change on latent inhibition and on Science, 98, 5359–5362.
the habituation of an orienting response. Journal of Hanson, H. M. (1959). Effects of discrimination train-
Experimental Psychology: Animal Behavior Processes, ing on stimulus generalization. Journal of Experimen-
11, 470–481. tal Psychology, 58, 321–334.
Hall, G., & Pearce, J. M. (1979). Latent inhibition of a Hardt, O., Nader, K., & Nadel, L. (2013). Decay hap-
CS during CS-US pairings. Journal of Experimental pens: the role of active forgetting in memory. Trends
Psychology: Animal Behavior Processes, 5, 31–42. in Cognitive Sciences 17, 111–120.
Hall, G., & Pearce, J. M. (1982). Restoring the asso- Hardt, O., Wang, S-H., & Nader, K. (2009). Storage or
ciability of a pre-exposed CS by a surprising event. retrieval deficit: The yin and yang of amnesia. Learn-
Quarterly Journal of Experimental Psychology B, 3, ing & Memory, 16, 224–230.
127–140. Harris, J. A. (2011). The acquisition of conditioned re-
Hall, G., & Rodríguez, G. (2010). Associative and non- sponding. Journal of Experimental Psychology: Animal
associative processes in latent inhibition: An elabo- Behavior Processes, 37, 151–164.
ration of the Pearce-Hall model. In R. E. Lubow & I. Harris, J. A., & Livesey, E. J. (2010). An attention-
Weinder (Eds.), Latent inhibition: Data, theories, and modulated associative network. Learning & Behavior,
applications to schizophrenia (pp. 114–36). Cambridge, 38, 1–26.
UK: Cambridge University Press. Harris, J. A., Andrew, B. J., & Kwok, D. W. S. (2013).
Hall, G., & Schachtman, T. R. (1987). Differential ef- Magazine approach during a signal for food de-
fects of a retention interval on latent inhibition and pends on Pavlovian, not instrumental, conditioning.
the habituation of an orienting response. Animal Journal of Experimental Psychology: Animal Behavior
Learning and Behavior, 15, 76–82. Processes, 39, 107–116.
Hall, G., Channell, S., & Schachtman, T. R. (1987). The Harris, J. A., Livesey, E. J., Gharaei, S., & Westbrook,
instrumental overshadowing effect in pigeons: The R. F. (2008). Negative patterning is easier than a
role of response bursts. Quarterly Journal of Experi- biconditional discrimination. Journal of Experimental
mental Psychology B, 39, 173–188. Psychology: Animal Behavior Processes, 34, 494–500.
Hall, G., Prados, J., & Sansa, J. (2005). Modulation Haselgrove, M. (2010). Reasoning rats or associative
of the effective salience of a stimulus by direct and animals? A common-element analysis of the effects
associative activation of its representation. Journal of additive and subadditive pretraining on blocking.
of Experimental Psychology: Animal Behavior Processes, Journal of Experimental Psychology: Animal Behavior
31, 267–276. Processes, 36, 296–306.
Hall, W. G., Arnold, H. M., & Myers, K. P. (2000). The Haselgrove, M., Aydin, A., & Pearce, J. M. (2004). A
acquisition of an appetite. Psychological Science, 11, partial reinforcement extinction effect despite equal
101–105. rates of reinforcement during Pavlovian condi-
Hamilton, D. A., Akers, K. G., Weisend, M. P., & tioning. Journal of Experimental Psychology: Animal
Sutherland, R. J. (2007). How do room and appara- Behavior Processes, 30, 240–250.
tus cues control navigation in the Morris water task? Haselgrove, M., Esber, G. R., Pearce, J. M., & Jones,
Evidence for distinct contributions to a movement P. M. (2010). Two kinds of attention in Pavlovian
vector. Journal of Experimental Psychology: Animal conditioning: Evidence for a hybrid model of learn-
Behavior Processes, 33, 100–114. ing. Journal of Experimental Psychology: Animal
Hamilton, D. A., Akers, K. G., Johnson, T. E., Rice, J. Behavior Processes, 36, 456–470.
P., Candalaria, F. T., Sutherland, R. J., Weisend, M. Hayward, A., Good, M. A., & Pearce, J. M. (2004). Fail-
P., & Redhead, E. S. (2008). The relative influence of ure of a landmark to restrict spatial learning based
place and direction in the Morris water task. Journal on the shape of the environment. Quarterly Journal of
of Experimental Psychology: Animal Behavior Processes, Experimental Psychology, 57, 289–314.
34, 31–53. Healy, A. F., Kosslyn, S. M., & Shiffrin, R. M. (1992).
Hamilton, W. D. (1964). The genetical theory of social Essays in honor of William K. Estes, Vol. 1: From
behavior: I and II. Journal of Theoretical Biology, 7, learning theory to connectionist theory; Vol. 2: From
1–52.
References  501

learning processes to cognitive processes. Hillsdale, NJ: Herrnstein, R. J., Loveland, D. H., & Cable, C. (1976).
Lawrence Erlbaum Associates, Inc. Natural concepts in pigeons. Journal of Experimental
Hearst, E., & Franklin, S. R. (1977). Positive and Psychology: Animal Behavior Processes, 2, 285–302.
negative relations between a signal and food: Heth, C. D. (1976). Simultaneous and backward fear
Approach-withdrawal behavior to the signal. Journal conditioning as a function of number of CS-UCS
of Experimental Psychology: Animal Behavior Processes, pairings. Journal of Experimental Psychology: Animal
3, 37–52. Behavior Processes, 2, 117–129.
Hearst, E., & Jenkins, H. M. (1974). Sign-tracking: The Hetherington, M. M., & Rolls, B. J. (1996). Sensory-
stimulus-reinforcer relation and directed action. Austin, specific satiety: Theoretical frameworks and
TX: The Psychonomic Society. central characteristics. In E. D. Capaldi (Ed.), Why
Hendersen, R. W. (1978). Forgetting of conditioned we eat what we eat: The psychology of eating (pp.
fear inhibition. Learning and Motivation, 9, 16–30. 267–290). Washington, DC: American Psychological
Hendersen, R. W. (1985). Fearful memories: The moti- Association.
vational significance of forgetting. In F. R. Brush & J. Hicks, L. H. (1964). Effects of overtraining on acquisi-
B. Overmier (Eds.), Affect, conditioning and cognition: tion and reversal of place and response learning.
Essays on the determinants of behavior. Hillsdale, NJ: Psychological Reports, 15, 459–462.
Lawrence Erlbaum Associates, Inc. Higgins, S. T., Silverman, K., & Heil, S. H. (Eds.).
Hendersen, R. W., & Graham, J. (1979). Avoidance of (2008). Contingency management in substance abuse
heat by rats: Effects of thermal context on rapidity of treatment. New York, NY: Guilford Press.
extinction. Learning and Motivation, 10, 351–363. Higgins, S. T., Washio, Y., Heil, S. H., Solomon, L. J.,
Hendersen, R. W., Patterson, J. M., & Jackson, R. L. Gaalema, D. E., Higgins, T. M., & Bernstein, I. M.
(1980). Acquisition and retention of control of instru- (2012). Financial incentives for smoking cessation
mental behavior by a cue-signaling airblast: How among pregnant and newly postpartum women.
specific are conditioned anticipations? Learning and Preventive Medicine, 55, S33–S40.
Motivation, 11, 407–426. Hinde, R. A. (1966). Animal behavior: A synthesis of
Herbranson, W. T., Fremouw, T., & Shimp, C. P. (1999). ethology and comparative psychology. London: Aca-
The randomization procedure in the study of cat- demic Press.
egorization of multidimensional stimuli by pigeons. Hinde, R. A. (1970). Animal behavior (2nd ed.). New
Journal of Experimental Psychology: Animal Behavior York: McGraw-Hill.
Processes, 25, 113–125. Hinde, R. A., & Stevenson-Hinde, J. (1973). Constraints
Hermer, L., & Spelke, E. S. (1994). A geometric process on learning: Limitations and predispositions. London:
for spatial reorientation in young children. Nature, Academic Press.
370, 57–59. Hineline, P. N. (2001). Beyond the molar-molecular
Herrnstein, R. J. (1961). Relative and absolute strength distinction: We need multiscaled analyses. Journal of
of response as a function of frequency of reinforce- the Experimental Analysis of Behavior, 75, 342–347.
ment. Journal of the Experimental Analysis of Behavior, Hinson, R. E., Poulos, C. X., & Cappell, H. (1982).
4, 267–272. Effects of pentobarbital and cocaine in rats expecting
Herrnstein, R. J. (1969). Method and theory in the pentobarbital. Pharmacology, Biochemistry & Behavior,
study of avoidance. Psychological Review, 76, 49–69. 16, 661–666.
Herrnstein, R. J. (1970). On the law of effect. Journal of Hirsch, J. (1963). Behavior genetics and individuality
the Experimental Analysis of Behavior, 13, 243–266. understood. Science, 142, 1436–1442.
Herrnstein, R. J. (1971). Quantitative hedonism. Jour- Hirsch, S. M., & Bolles, R. C. (1980). On the ability of
nal of Psychiatric Research, 8, 399–412. prey to recognize predators. Zeitschrift fur Tierpsy-
Herrnstein, R. J., & deVilliers, P. A. (1980). Fish as a chologie, 54, 71–84.
natural category for people and pigeons. In G. H. Hobbes, T. (1650). Human nature. London.
Bower (Ed.), The psychology of learning and motivation Hoffman, H. S., & Ratner, A. M. (1973a). A reinforce-
(Vol. 14, pp. 59–95). New York: Academic Press. ment model of imprinting: Implications for social-
Herrnstein, R. J., & Hineline, P. N. (1966). Negative ization in monkeys and men. Psychological Review,
reinforcement as shock frequency reduction. Journal 80, 527–544.
of the Experimental Analysis of Behavior, 9, 421–430. Hoffman, H. S., & Ratner, A. M. (1973b). Effects of
Herrnstein, R. J., & Loveland, D. H. (1964). Complex stimulus and environmental familiarity on visual
visual concept in the pigeon. Science, 146, 549–551. imprinting in newly hatched ducklings. Journal of
Herrnstein, R. J., & Prelec, D. (1992). Melioration. In Comparative & Physiological Psychology, 85, 11–19.
G. Loewenstein & J. Elster (Eds.), Choice over time Hoffman, H. S., Eiserer, L. A., & Singer, D. (1972).
(pp. 235–263). New York: Russell Sage. Acquisition of behavioral control by a stationary im-
Herrnstein, R. J., & Vaughan, W. (1980). Melioration printing stimulus. Psychonomic Science, 26, 146–148.
and behavioral allocation. In J. E. Staddon (Ed.), Hoffman, H. S., Ratner, A. M., & Eiserer, L. A. (1972).
Limits to action: The allocation of individual behavior Role of visual imprinting in the emergence of
(pp. 143–176). New York: Academic Press. specific filial attachments in ducklings. Journal of
Comparative & Physiological Psychology, 81, 399–409.
502  References

Hoffman, H. S., Selekman, W., & Fleshler, M. (1966). (Eds.), Brain organization and memory: Cells, systems,
Stimulus aspects of aversive controls: Long term and circuits (pp. 78–104). New York: Oxford Univer-
effects of suppression procedures. Journal of the sity Press.
Experimental Analysis of Behavior, 9, 659–662. Holland, P. C. (1992). Occasion setting in Pavlovian
Hogarth, L. (2012). Goal-directed and transfer-cue- conditioning. In G. Bower (Ed.), The psychology of
elicited drug-seeking are dissociated by pharma- learning and motivation (Vol. 28, pp. 69–125). Orlando,
cotherapy: Evidence for independent additive FL: Academic Press.
controllers. Journal of Experimental Psychology: Animal Holland, P. C. (1999). Overshadowing and blocking as
Behavior Processes, 38, 266–278. acquisition deficits: No recovery after extinction of
Hogarth, L., & Chase, H. W. (2011). Parallel goal-di- overshadowing or blocking cues. Quarterly Journal of
rected and habitual control of human drug-seeking: Experimental Psychology B: Comparative and Physi-
Implications for dependence variability. Journal of ological Psychology, 52b, 307–333.
Experimental Psychology: Animal Behavior Processes, Holland, P. C. (2000). Trial and intertrial durations in
37, 261–276. appetitive conditioning in rats. Animal Learning &
Hogarth, L., Dickinson, A., & Duka, T. (2010). The Behavior, 28, 121–135.
associative basis of cue elicited drug taking in hu- Holland, P. C. (2004). Relations between Pavlovian-
mans. Psychopharmacology, 208, 337–351. instrumental transfer and reinforcer devaluation.
Hogarth, L., Dickinson, A., Austin, A., Brown, C., & Journal of Experimental Psychology: Animal Behavior
Duka, T. (2008). Attention and expectation in human Processes, 30, 104–117.
predictive learning: The role of uncertainty. Quarter- Holland, P. C. (2005). Amount of training effects in
ly Journal of Experimental Psychology, 61, 1658–1668. representation-mediated food aversion learning: No
Holland, P. C. (1977). Conditioned stimulus as a de- evidence of a role for associability changes. Learning
terminant of the form of the Pavlovian conditioned & Behavior, 33, 464–478.
response. Journal of Experimental Psychology: Animal Holland, P. C. (2008). Cognitive versus stimulus-
Behavior Processes, 3, 77–104. response theories of learning. Learning & Behavior,
Holland, P. C. (1979a). Differential effects of omission 36, 227–241.
contingencies on various components of Pavlovian Holland, P. C., & Rescorla, R. A. (1975a). Second-order
appetitive conditioned responding in rats. Journal of conditioning with food unconditioned stimulus.
Experimental Psychology: Animal Behavior Processes, 5, Journal of Comparative and Physiological Psychology, 88,
178–193. 459–467.
Holland, P. C. (1979b). The effects of qualitative and Holland, P. C., & Rescorla, R. A. (1975b). The effect of
quantitative variation in the US on individual com- two ways of devaluing the unconditioned stimulus
ponents of Pavlovian appetitive conditioned behav- after first- and second-order appetitive conditioning.
ior in rats. Animal Learning & Behavior, 7, 424–432. Journal of Experimental Psychology: Animal Behavior
Holland, P. C. (1984). Differential effects of reinforce- Processes, 1, 355–363.
ment of an inhibitory feature after serial and simul- Holland, P. C., & Straub, J. J. (1979). Differential effects
taneous feature negative discrimination training. of two ways of devaluing the unconditioned stimu-
Journal of Experimental Psychology: Animal Behavior lus after Pavlovian appetitive conditioning. Journal
Processes, 10, 461–475. of Experimental Psychology: Animal Behavior Processes,
Holland, P. C. (1985). The nature of conditioned inhi- 5, 65–78.
bition in serial and simultaneous feature negative Hollis, K. L. (1982). Pavlovian conditioning of signal-
discrimination. In R. R. Miller & N. E. Spear (Eds.), centered action patterns and autonomic behavior: A
Information processing in animals: Conditioned inhibi- biological analysis of function. In R. A. Rosenblatt,
tion (pp. 267–298). Hillsdale, NJ: Lawrence Erlbaum R. A. Hinde, C. Beer, & M. C. Busnel (Eds.), Advances
Associates, Inc. in the study of behavior, Vol. 12 (pp. 1–64). New York:
Holland, P. C. (1986). Transfer after serial feature posi- Academic Press.
tive discrimination training. Learning and Motivation, Hollis, K. L. (1984). The biological function of Pavlov-
17, 243–268. ian conditioning: The best defense is a good offense.
Holland, P. C. (1989a). Occasion setting with simul- Journal of Experimental Psychology: Animal Behavior
taneous compounds in rats. Journal of Experimental Processes, 10, 413–425.
Psychology: Animal Behavior Processes, 15, 183–193. Hollis, K. L. (1990). The role of Pavlovian conditioning
Holland, P. C. (1989b). Transfer of negative occa- in territorial aggression and reproduction. In D. A.
sion setting and conditioned inhibition across Dewsbury (Ed.), Contemporary issues in comparative
conditioned and unconditioned stimuli. Journal of psychology (pp. 197–219). Sunderland, MA: Sinauer
Experimental Psychology: Animal Behavior Processes, Associates, Inc.
15, 311–328. Hollis, K. L. (1997). Contemporary research on Pav-
Holland, P. C. (1990a). Event representation in Pavlov- lovian conditioning: A “new” functional analysis.
ian conditioning: Image and action. Cognition, 37, American Psychologist, 52, 956–965.
105–131. Hollis, K. L., Cadieux, E. L., & Colbert, M. M. (1989).
Holland, P. C. (1990b). Forms of memory in Pavlov- The biological function of Pavlovian conditioning: A
ian conditioning. In N. M. Weinberger, & G. Lynch mechanism for mating success in the blue gourami
References  503

(Trichogaster trichopterus). Journal of Comparative Hull, C. L. (1931). Goal attraction and directing ideas
Psychology, 103, 115–121. conceived as habit phenomena. Psychological Review,
Hollis, K. L., Pharr, V. L., Dumas, M. J., Britton, G. 38, 487–506.
B., & et al. (1997). Classical conditioning provides Hull, C. L. (1943). Principles of behavior: An in-
paternity advantage for territorial male blue goura- troduction to behavior theory. New York:
mis (Trichogaster trichopterus). Journal of Comparative Appleton-Century-Crofts.
Psychology, 111, 219–225. Hull, C. L. (1952). A behavior system: An introduction
Holmes, N. M., Marchand, A. R., & Coutureau, to behavior theory concerning the individual organism.
E. (2010). Pavlovian to instrumental transfer: A New Haven, CT: Yale University Press.
neurobehavioural perspective. Neuroscience and Hulse, S. H., Fowler, H., & Honig, W. K. (Eds.). (1978).
Biobehavioral Reviews, 34, 1277–1295. Cognitive processes in animal behavior. Hillsdale, NJ:
Holyoak, K. J., & Cheng, P. W. (2011). Causal learning Lawrence Erlbaum Associates.
and inference as a rational process: The new synthe- Hulse, S. H., Jr. (1958). Amount and percentage of
sis. Annual Review of Psychology, 62, 135–163. reinforcement and duration of goal confinement in
Honey, R. C., & Hall, G. (1989). Acquired equivalence conditioning and extinction. Journal of Experimental
and distinctiveness of cues. Journal of Experimental Psychology, 56, 48–57.
Psychology: Animal Behavior Processes, 15, 338–346. Hume, D. (1739). A treatise of human nature.
Honey, R. C., & Hall, G. (1991). Acquired equivalence Hursh, S. R. (1980). Economic concepts for the analy-
and distinctiveness of cues using a sensory-precon- sis of behavior. Journal of the Experimental Analysis of
ditioning procedure. Quarterly Journal of Experimen- Behavior, 34, 219–238.
tal Psychology, 43B, 121–135. Hursh, S. R., & Natelson, B. H. (1981). Electrical brain
Honey, R. C., & Watt, A. (1998). Acquired relational stimulation and food reinforcement dissociated
equivalence: Implications for the nature of associa- by demand elasticity. Physiology and Behavior, 26,
tive structures. Journal of Experimental Psychology: 509–515.
Animal Behavior Processes, 24, 325–334. Hursh, S. R. (2014). Behavioral economics and the
Honey, R. C., Good, M., & Manser, K. L. (1998). Nega- analysis of consumption and choice. In F. K. Mc-
tive priming in associative learning: Evidence from a Sweeney & E. S. Murphy (Eds.), The Wiley Blackwell
serial-habituation procedure. Journal of Experimental handbook of operant and classical conditioning (pp.
Psychology: Animal Behavior Processes, 24, 229–237. 275–305). West Sussex, UK: John Wiley & Sons.
Honig, W. K. (1962). Prediction of preference, transpo- Hurwitz, H. M., & Roberts, A. E. (1977). Aversively
sition, and transposition-reversal from the general- controlled behavior and the analysis of conditioned
ization gradient. Journal of Experimental Psychology, suppression. In H. Davis & H. M. Hurwitz (Eds.),
64, 239–248. Operant-Pavlovian interactions (pp. 189–224). Hills-
Honig, W. K., Boneau, C. A., Burstein, K. R., & Penny- dale, NJ: Lawrence Erlbaum Associates, Inc.
packer, H. S. (1963). Positive and negative general- Hutcheson, D. M., Everitt, B. J., Robbins, T. W., &
ization gradients obtained under equivalent training Dickinson, A. (2001). The role of withdrawal in
conditions. Journal of Comparative and Physiological heroin addiction: Enhances reward or promotes
Psychology, 56, 111–116. avoidance? Nature Neuroscience, 4, 943–947.
Horne, M. R., & Pearce, J. M. (2010). Conditioned I
inhibition and superconditioning in an environment Inui, T., Shimura, T., & Yamamoto, T. (2006). Effect of
with a distinctive shape. Journal of Experimental Psy- brain lesions on taste-potentiated odor aversion in
chology: Animal Behavior Processes, 36, 381–394. rats. Behavioral Neuroscience, 120, 590–599.
Horne, M. R., Gilroy, K. E., Cuell, S. F., & Pearce, J. M. Ison, J. R. (1962). Experimental extinction as a function
(2012). Latent spatial learning in an environment of number of reinforcements. Journal of Experimental
with a distinctive shape. Journal of Experimental Psy- Psychology, 64, 314–317.
chology: Animal Behavior Processes, 38, 139–147.
Howlett, R. J., & Majerus, M. E. N. (1987) The under- J
standing of industrial melanism in the peppered Jackson, R. L., Alexander, J. H., & Maier, S. F. (1980).
moth (Biston betularia) (Lepidoptera: Geometridae). Learned helplessness, inactivity, and associative
I. Positive reinforcement. Psychological Review, 66, Neurobiology of Learning and Memory, 92, 135–138.
219–233. Reberg, D. (1972). Compound tests for excitation in
Premack, D. (1962). Reversibility of the reinforcement early acquisition and after prolonged extinction of
relation. Science, 136, 255–257. conditioned suppression. Learning and Motivation, 3,
Premack, D. (1963a). Prediction of the comparative 246–258.
reinforcement values of running and drinking. Sci- Redhead, E. S., & Pearce, J. M. (1995). Similarity and
ence, 139, 1062–1063. discrimination learning. Quarterly Journal of Experi-
Premack, D. (1963b). Rate differential reinforcement mental Psychology B: Comparative and Physiological
in monkey manipulation. Journal of the Experimental Psychology, 48B, 46–66.
Analysis of Behavior, 6, 81–89. Redhead, E. S., Roberts, A., Good, M., & Pearce, J.
Premack, D. (1965). Reinforcement theory. In D. M. (1997). Interaction between piloting and beacon
Levine (Ed.), Nebraska symposium on motivation homing by rats in a swimming pool. Journal of
(Vol. 13, pp. 123–188). Lincoln, NE: University of Experimental Psychology: Animal Behavior Processes,
Nebraska Press. 23, 340–350.
Premack, D. (1971a). Catching up with common sense Redish, A. D., Jensen, S., Johnson, A., & Kurth-Nelson,
or two sides of a generalization: Reinforcement and Z. (2007). Reconciling reinforcement learning mod-
punishment. In R. Glaser (Ed.), The nature of rein- els with behavioral extinction and renewal: Implica-
forcement (pp. 121–150). New York: Academic Press. tions for addiction, relapse, and problem gambling.
Premack, D. (1971b). Language in chimpanzee? Sci- Psychological Review, 114, 784–805.
ence, 172, 808–822. Reid, P. J., & Shettleworth, S. J. (1992). Detection of
Premack, D. (1983). Animal cognition. Annual Review cryptic prey: Search image or search rate? Journal of
of Psychology, 34, 351–362. Experimental Psychology: Animal Behavior Processes,
Preston, K. L., Umbricht, A., Wong, C. J., & Epstein, D. 18, 273–286.
H. (2001). Shaping cocaine abstinence by successive Reilly, S., & Schachtman, T. R. (1987). The effects of ITI
approximation. Journal of Consulting and Clinical fillers in autoshaping. Learning and Motivation, 18,
Psychology, 69, 643–654. 202–219.
Pryor, K., & Ramierez, K. (2014). Modern animal Reiss, S., & Wagner, A. R. (1972). CS habituation
training: A transformative technology. In F. K. Mc- produces a “latent inhibition effect” but no active
Sweeney & E. S. Murphy (Eds.), The Wiley Blackwell “conditioned inhibition.” Learning and Motivation, 3,
handbook of operant and classical conditioning (pp. 237–245.
455–482). West Sussex, UK: John Wiley & Sons. Renner, K. E. (1964). Delay of reinforcement: A histori-
R cal review. Psychological Bulletin, 61, 341–361.
Rachlin, H. (1974). Self-control. Behaviorism, 3, 94–107. Reppucci, C. J., & Petrovich, G. D. (2012). Learned
Rachlin, H. (1976). Behavior and learning. San Fran- food-cue stimulates persistent feeding in sated rats.
cisco: W.H. Freeman. Appetite, 59, 437–447.
Rachlin, H., & Baum, W. M. (1972). Effects of alterna- Rescorla, R. A. (1966). Predictability and number of
tive reinforcement: Does the source matter? Journal pairings in Pavlovian fear conditioning. Psychonomic
of the Experimental Analysis of Behavior, 18, 231–241. Science, 4, 383–384.
Rachlin, H., & Green, L. (1972). Commitment, choice Rescorla, R. A. (1967a). Inhibition of delay in Pavlov-
and self-control. Journal of the Experimental Analysis ian fear conditioning. Journal of Comparative and
of Behavior, 17, 15–22. Physiological Psychology, 64, 114–120.
Rachlin, H., Green, L., Kagel, J. H., & Battalio, R. C. Rescorla, R. A. (1967b). Pavlovian conditioning and
(1976). Economic demand theory and psychological its proper control procedures. Psychological Review,
studies of choice. In G. Bower (Ed.), The psychology 74, 71–80.
of learning and motivation. Vol. 10 (pp. 129–154). New Rescorla, R. A. (1968a). Pavlovian conditioned fear in
York: Academic Press. Sidman avoidance learning. Journal of Comparative
Ramsay, D. S., & Woods, S. C. (1997). Biological and Physiological Psychology, 65, 55–60.
consequences of drug administration: Implications Rescorla, R. A. (1968b). Probability of shock in the
presence and absence of CS in fear conditioning.
516  References

Journal of Comparative and Physiological Psychology, Rescorla, R. A. (1999b). Partial reinforcement reduces
66, 1–5. the associative change produced by nonreinforce-
Rescorla, R. A. (1969a). Establishment of a positive ment. Journal of Experimental Psychology: Animal
reinforcer through contrast with shock. Journal of Behavior Processes, 25, 403–414.
Comparative and Physiological Psychology, 67, 504–509. Rescorla, R. A. (2000). Extinction can be enhanced by a
Rescorla, R. A. (1969b). Pavlovian conditioned inhibi- concurrent excitor. Journal of Experimental Psychology:
tion. Psychological Bulletin, 72, 77–94. Animal Behavior Processes, 26, 251–260.
Rescorla, R. A. (1970). Reduction in the effectiveness Rescorla, R. A. (2001). Experimental extinction. In R.
of reinforcement after prior excitatory conditioning. R. Mowrer & S. B. Klein (Eds.), Handbook of contem-
Learning and Motivation, 1, 372–381. porary learning theories (pp. 119–154). Mahwah, NJ:
Rescorla, R. A. (1971). Summation and retardation Erlbaum.
tests of latent inhibition. Journal of Comparative and Rescorla, R. A. (2003). Protection from extinction.
Physiological Psychology, 75, 77–81. Learning & Behavior, 31, 124–132.
Rescorla, R. A. (1972). Informational variables in Rescorla, R. A. (2006a). Deepened extinction from
Pavlovian conditioning. In G. H. Bower (Ed.), The compound stimulus presentation. Journal of Ex-
psychology of learning and motivation. (Vol. 6, pp. perimental Psychology: Animal Behavior Processes, 32,
1–46). New York: Academic Press. 135–144.
Rescorla, R. A. (1973). Effects of US habituation Rescorla, R. A. (2006b). Spontaneous recovery from
following conditioning. Journal of Comparative and overexpectation. Learning & Behavior, 34, 13–20.
Physiological Psychology, 82, 137–143. Rescorla, R. A. (2007). Renewal after overexpectation.
Rescorla, R. A. (1974). Effect of inflation of the un- Learning & Behavior, 35, 19–26.
conditioned stimulus value following conditioning. Rescorla, R. A., & Colwill, R. M. (1989). Associa-
Journal of Comparative and Physiological Psychology, 86, tions with anticipated and obtained outcomes in
101–106. instrumental learning. Animal Learning & Behavior,
Rescorla, R. A. (1978). Some implications of a cogni- 17, 291–303.
tive prespective on Pavlovian conditioning. In S. Rescorla, R. A., & Durlach, P. J. (1981). Within-event
S. Hulse, H. Fowler, & K. Honig (Eds.), Cognitive learning in Pavlovian conditioning. In N. E. Spear &
processes in animal behavior. Hillsdale, NJ: Lawrence R. R. Miller (Eds.), Information processing in animals:
Erlbaum Associates, Inc. Memory mechanisms (pp. 81–112). Hillsdale, NJ: Law-
Rescorla, R. A. (1979). Aspects of the reinforcer rence Erlbaum Associates, Inc.
learned in second-order Pavlovian conditioning. Rescorla, R. A., & Heth, C. D. (1975). Reinstatement
Journal of Experimental Psychology: Animal Behavior of fear to an extinguished conditioned stimulus.
Processes, 5, 79–95. Journal of Experimental Psychology: Animal Behavior
Rescorla, R. A. (1980). Pavlovian second order condi- Processes, 1, 88–96.
tioning: Studies in associative learning. Hillsdale, NJ: Rescorla, R. A., & Holland, P. C. (1977). Associations
Lawrence Erlbaum Associates, Inc. in Pavlovian conditioned inhibition. Learning and
Rescorla, R. A. (1985). Conditioned inhibition and Motivation, 8, 429–447.
facilitation. In R. R. Miller & N. E. Spear (Eds.), Rescorla, R. A., & Lolordo, V. M. (1965). Inhibition of
Information processing in animals: Conditioned inhibi- avoidance behavior. Journal of Comparative and Physi-
tion (pp. 299–326). Hillsdale, NJ: Lawrence Erlbaum ological Psychology, 59, 406–412.
Associates, Inc. Rescorla, R. A., & Solomon, R. L. (1967). Two-process
Rescorla, R. A. (1986). Extinction of facilitation. Journal learning theory: Relationships between Pavlovian
of Experimental Psychology: Animal Behavior Processes, conditioning and instrumental learning. Psychologi-
12, 16–24. cal Review, 74, 151–182.
Rescorla, R. A. (1987). A Pavlovian analysis of goal- Rescorla, R. A., & Wagner, A. R. (1972). A theory of
directed behavior. American Psychologist, 42, 119–129. Pavlovian conditioning: Variations in the effective-
Rescorla, R. A. (1988a). Facilitation based on inhibi- ness of reinforcement and nonreinforcement. In A.
tion. Animal Learning & Behavior, 16, 169–176. H. Black & W. F. Prokasy (Eds.), Classical conditioning
Rescorla, R. A. (1988b). Pavlovian conditioning: It’s II. New York: Appleton-Century-Crofts.
not what you think it is. American Psychologist, 43, Revusky, S. H. (1967). Hunger level during food
151–160. consumption: Effects on subsequent preference.
Rescorla, R. A. (1991). Associative relations in instru- Psychonomic Science, 7, 109–110.
mental learning: The eighteenth Bartlett Memorial Revusky, S. H. (1968). Effects of thirst level during
Lecture. Quarterly Journal of Experimental Psychology, consumption of flavored water on subsequent
43B, 1–23. preference. Journal of Comparative and Physiological
Rescorla, R. A. (1994). Transfer of instrumental control Psychology, 66, 777–779.
mediated by a devalued outcome. Animal Learning & Revusky, S. H. (1971). The role of interference in asso-
Behavior, 22, 27–33. ciation over a delay. In W. K. Honig & P. H. R. James
Rescorla, R. A. (1999a). Associative changes in ele- (Eds.), Animal memory. New York: Academic Press.
ments and compounds when the other is reinforced. Riccio, D. C., Rabinowitz, V. C., & Axelrod, S. (1994).
Journal of Experimental Psychology: Animal Behavior Memory: When less is more. American Psychologist,
Processes, 25, 247–255. 49, 917–926.
References  517

Riccio, D. C., Richardson, R., & Ebner, D. L. (1984). short-term memory. Journal of Experimental Psychol-
Memory retrieval deficits based upon altered ogy: Animal Behavior Processes, 4, 219–236.
contextual cues: A paradox. Psychological Bulletin, 96, Roberts, W. A., & Mazmanian, D. S. (1988). Concept
152–165. learning at different levels of abstraction by pigeons,
Riccio, D. C., Urda, M., & Thomas, D. R. (1966). monkeys, and people. Journal of Experimental Psy-
Stimulus control in pigeons based on proprioceptive chology: Animal Behavior Processes, 14, 247–260.
stimuli from floor inclination. Science, 153, 434–436. Robinson, T. E., & Berridge, K. C. (1993). The neural
Richter, C. P. (1927). Animal behavior and internal basis of drug craving: An incentive-sensitization the-
drives. Quarterly Review of Biology, 2, 307–343. ory of addiction. Brain Research Reviews, 18, 247–291.
Richter, C. P. (1936). Increased salt appetite in adrenal- Robinson, T. E., & Berridge, K. C. (2003). Addiction.
ectomized rats. American Journal of Physiology, 115, Annual Review of Psychology, 54, 25–53.
155–161. Robinson, T. E., Flagel, S. B. (2009). Dissociating the
Richter, C. P., Holt, L. E., Jr., & Barelare, B., Jr. (1938). predictive and incentive motivational properties of
Nutritional requirements for normal growth and reward-related cues through the study of individual
reproduction in rats studied by the self-selection differences. Biological Psychiatry, 65, 869–873
method. American Journal of Physiology, 122, 734–744. Rodgers, W. L. (1967). Specificity of specific hungers.
Richter, J., Hamm, A. O., Pane-Farre, C. A., Gerlach, Journal of Comparative and Physiological Psychology,
A. L., Gloster, A. T., Wittchen, H-U., Lang, T., Alp- 64, 49–58.
ers, G. W., Helbig-Lang, S., Deckert, J., Fydrich, T., Rodrigo, T., Chamizo, V. D., McLaren, I. P., & Mack-
Fehm, L., Strohle, A., Kircher, T., & Arolt, V. (2012). intosh, N. J. (1997). Blocking in the spatial domain.
Dynamics of defensive reactivity in patients with Journal of Experimental Psychology: Animal Behavior
panic disorder and agoraphobia: Implications for Processes, 23, 110–118.
the etiology of panic disorder. Biological Psychiatry, Roelofs, K., Hagenaars, M. A., & Stins, J. (2010). Fac-
72, 512–520. ing freeze: Social threat induces bodily freeze in
Ricker, S. T., & Bouton, M. E. (1996). Reacquisition fol- humans. Psychological Science, 21, 1575–1581.
lowing extinction in appetitive conditioning. Animal Roitblat, H. L. (1980). Codes and coding processes
Learning & Behavior, 24, 423–436. in pigeon short-term memory. Animal Learning &
Rilling, M. (1977). Stimulus control and inhibitory Behavior, 8, 341–351.
processes. In W. K. Honig & J. E. R. Staddon (Eds.), Roitblat, H. L. (1987). Introduction to comparative cogni-
Handbook of operant behavior (pp. 432–480). Engle- tion. New York: W. H. Freeman.
wood Cliffs, NJ: Prentice-Hall. Romanes, G. J. (1882). Animal intelligence. London:
Ritchie, B. F., Aeschliman, B., & Peirce, P. (1950). Stud- Kegan, Paul, Trench and Co.
ies in spatial learning: VIII. Place performance and Rosas, J. M., & Alonso, G. (1996). Temporal discrimi-
acquisition of place dispositions. Journal of Compara- nation and forgetting of CS duration in conditioned
tive and Physiological Psychology, 43, 73–85. suppression. Learning and Motivation, 27, 43–57.
Rizley, R. C., & Rescorla, R. A. (1972). Associations Rosas, J. M., & Alonso, G. (1997). Forgetting of the CS
in second-order conditioning and sensory precon- duration in rats: The role of retention interval and
ditioning. Journal of Comparative and Physiological training level. Learning and Motivation, 28, 404–423.
Psychology, 81, 1–11. Rosas, J. M., & Bouton, M. E. (1996). Spontaneous
Roberts, A. D. L., & Pearce, J. M. (1999). Blocking in recovery after extinction of a conditioned taste aver-
the Morris swimming pool. Journal of Experimental sion. Animal Learning & Behavior, 24, 341–348.
Psychology: Animal Behavior Processes, 25, 225–235. Rosas, J. M., Todd, T. P., and Bouton, M. E. (2013).
Roberts, S. (1981). Isolation of an internal clock. Context change and associative learning. Wiley Inter-
Journal of Experimental Psychology: Animal Behavior disciplinary Reviews: Cognitive Science, in press.
Processes, 7, 242–268. Rosellini, R. A., DeCola, J. P., & Warren, D. A. (1986).
Roberts, W. A. (1972). Short-term memory in the The effect of feedback stimuli on contextual fear
pigeon: Effects of repetition and spacing. Journal of depends upon the length of the intertrial interval.
Experimental Psychology, 94, 74–83. Learning and Motivation, 17, 229–242.
Roberts, W. A. (1980). Distribution of trials and inter- Rosellini, R. A., DeCola, J. P., Plonsky, M., Warren, D.
trial retention in delayed matching to sample with A., & Stilman, A. J. (1984). Uncontrollable shock pro-
pigeons. Journal of Experimental Psychology: Animal actively increases sensitivity to response-reinforcer
Behavior Processes, 6, 217–237. independence in rats. Journal of Experimental Psychol-
Roberts, W. A. (1984). Some issues in animal spatial ogy: Animal Behavior Processes, 10, 346–359.
memory. In H. L. Roitblat, T. G. Bever, & H. S. Ter- Rosellini, R. A., Warren, D. A., & DeCola, J. P. (1987).
race (Eds.), Animal cognition. Hillsdale, NJ: Lawrence Predictability and controllability: Differential effects
Erlbaum Associates, Inc. upon contextual fear. Learning and Motivation, 18,
Roberts, W. A. (1998). Principles of animal cognition. 392–420.
Boston: McGraw Hill. Rosen, J. C., & Leitenberg, H. (1982). Bulimia nervosa:
Roberts, W. A., & Grant, D. S. (1978). An analysis Treatment with exposure and response prevention.
of light-induced retroactive inhibition in pigeon Behavior Therapy, 13, 117–124.
518  References

Ross, R. T., & Holland, P. C. (1981). Conditioning of schedule-induction procedure in free feeding rats.
simultaneous and serial feature-positive discrimina- Alcohol and Drug Research, 7, 461–469.
tions. Animal Learning & Behavior, 9, 293–303. Sansa, J., Rodrigo, T., Santamaria, J., Manteigia, R. D.,
Rovee-Collier, C. (1987). Learning and memory in & Chamizo, V. D. (2009). Conditioned inhibition in
infancy. In J. D. Osofsky (Ed.), Handbook of infant the spatial domain. Journal of Experimental Psychol-
development. (pp. 98–148). New York: Wiley. ogy: Animal Behavior Processes, 35, 566–577.
Rovee-Collier, C. (1999). The development of infant Sargisson, R. J., & White, K. G. (2001). Generalization
memory. Current Directions in Psychological Science, of delayed matching to sample following training at
8, 80–85. different delays. Journal of the Experimental Analysis
Rozin, P. (1967). Specific aversions as a component of of Behavior, 75, 1–14.
specific hungers. Journal of Comparative and Physi- Saunders, B. T., & Robinson, T. E. (2013). Individual
ological Psychology, 64, 237–242. variation in resisting temptation: Implications for
Rozin, P., & Kalat, J. W. (1971). Specific hungers and addiction. Neuroscience and Biobehavioral Reviews, 37,
poison avoidance as adaptive specializations of 1955–1975.
learning. Psychological Review, 78, 459–486. Save, E., Poucet, B., & Thinus-Blanc, C. (1998). Land-
Rozin, P., & Rodgers, W. (1967). Novel-diet prefer- mark use in the cognitive map in the rat. In S. Healy
ences in vitamin deficient rats and rats recovered (Ed.), Spatial representation in animals (pp. 119–132).
from vitamin deficiency. Journal of Comparative and New York: Oxford University Press.
Physiological Psychology, 63, 421–428. Scalera, G., & Bavieri, M. (2009). Role of conditioned
Rudolph, R. L., & Van Houten, R. (1977). Auditory taste aversion on the side effects of chemotherapy
stimulus control in pigeons: Jenkins and Harrison in cancer patients. In S. Reilly & T. R. Schachtman
(1960) revisited. Journal of the Experimental Analysis of (Eds.), Conditioned taste aversion: Behavioral and neural
Behavior, 27, 327–330. processes (pp. 513–541). New York: Oxford Univer-
Rudy, J. W., & O’Reilly, R. C. (1999). Contextual fear sity Press.
conditioning, conjunctive representations, pattern Schachtman, T. R., Brown, A. M., Gordon, E. L., Cat-
completion, and the hippocampus. Behavioral Neuro- terson, D. A., & Miller, R. R. (1987). Mechanisms
science, 113, 867–880. underlying retarded emergence of conditioned re-
Rumelhart, D. E. (1989). The architecture of mind: A sponding following inhibitory training: Evidence for
connectionist approach. In M. I. Posner (Ed.), Foun- the comparator hypothesis. Journal of Experimental
dations of cognitive science. (pp. 133–159). Cambridge, Psychology: Animal Behavior Processes, 13, 310–322.
MA: MIT Press. Schafe, G. E., & LeDoux, J. E. (2000). Memory con-
Rumelhart, D. E., & McClelland, J. L. (Eds.). (1986a). solidation in auditory Pavlovian fear conditioning
Parallel distributed processing. Explorations in the requires protein synthesis and protein kinase A in
microstructure of cognition. Vol. 1: Foundations. Cam- the amygdala. Journal of Neuroscience, 20, RC96, 1–5.
bridge, MA: MIT Press. Schleidt, W. M. (1961). Reaction of turkeys to flying
Rumelhart, D. E., & McClelland, J. L. (Eds.). (1986b). predators and experiment to analyse their AAM’s.
Parallel distributed processing. Explorations in the Zeitschrift fur Tierpsychologie, 18, 543–560.
microstructure of cognition. Vol. 2: Psychological and Schmajuk, N. A., & Holland, P. C. (Eds.). (1998).
biological models. Cambridge, MA: MIT Press. Occasion setting: Associative learning and cognition in
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. animals. Washington, DC: American Psychological
(1986). Learning representations by back-propagat- Association.
ing errors. Nature, 323, 533–536. Schmajuk, N. A., Lamoureux, J. A., & Holland, P. C.
Rusiniak, K. W., Hankins, W. G., Garcia, J., & Brett, (1998). Occasion setting: A neural network ap-
L. P. (1979). Flavor-illness aversions: Potentiation of proach. Psychological Review, 105, 3–32.
odor by taste in rats. Behavioral and Neural Biology, Schneider, W., & Shiffrin, R. M. (1977). Controlled and
25, 1–17. automatic human information processing: I. Detec-
S tion, search, and attention. Psychological Review, 84,
Saksida, L. M. (1999). Effects of similarity and experi- 1–66.
ence on discrimination learning: A nonassociative Schoenfeld, W. N. (1950). An experimental approach
connectionist model of perceptual learning. Journal to anxiety, escape, and avoidance behavior. In J. Z. P.
of Experimental Psychology: Animal Behavior Processes, H. Hock (Ed.), Anxiety. New York: Grune & Stratton.
25, 308–323. Schull, J. (1979). A conditioned opponent theory of
Saksida, L. M., & Wilkie, D. M. (1994). Time-of-day Pavlovian conditioning and habituation. In G. H.
discrimination by pigeons. Animal Learning & Behav- Bower (Ed.), The psychology of learning and motivation
ior, 22, 143–154. (pp. 57–90). New York: Academic Press.
Salkovskis, P. M., Clark, D. M., & Gelder, M. G. (1996). Schultz, W. (2006). Behavior theories and the neuro-
Cognition-behaviour links in the persistence of physiology of reward. Annual Review of Psychology,
panic. Behaviour Research and Therapy, 34, 453–458. 57, 87–115.
Samson, H. H., & Pfeffer, A. O. (1987). Initia- Sclafani, A. (1995). How food preferences are learned:
tion of ethanol-maintained responding using a Laboratory animal models. Proceedings of the Nutri-
tion Society, 54, 419–427.
References  519

Sclafani, A. (1997). Learned controls of ingestive differences in delay discounting: Relation to intel-
behavior. Appetite, 29, 153–158. ligence, working memory, and anterior prefrontal
Scobie, S. R. (1972). Interaction of an aversive Pav- cortex. Psychological Science, 19, 904–911.
lovian conditional stimulus with aversively and Shanks, D. R. (1985). Forward and backward blocking
appetitively motivated operants in rats. Journal of in human contingency judgment. Quarterly Journal of
Comparative and Physiological Psychology, 79, 171–188. Experimental Psychology B, 37B, 1–21.
Seaman, S. F. (1985). Growth of morphine tolerance: Shanks, D. R. (1991). Categorization by a connection-
The effect of dose and interval between doses. In F. ist network. Journal of Experimental Psychology: Learn-
R. Brush & J. B. Overmier (Eds.), Affect, conditioning ing, Memory, and Cognition, 17, 433–443.
and cognition: Essays on the determinants of behavior Shanks, D. R. (1995). The psychology of associative learning.
(pp. 249–262). Hillsdale, NJ: Lawrence Erlbaum As- Cambridge, England: Cambridge University Press.
sociates, Inc. Shanks, D. R. (2010). Learning: From association to
Sechenov, I. M. (1965). Reflexes of the brain (Originally cognition. Annual Review of Psychology, 61, 273–301.
published, 1863 ed.). Cambridge, MA: MIT Press. Shanks, D. R., & Darby, R. J. (1998). Feature- and rule-
Seeley, R. J., Ramsay, D. S., & Woods, S. C. (1997). based generalization in human associative learning.
Regulation of food intake: Interactions between Journal of Experimental Psychology: Animal Behavior
learning and physiology. In M. E. Bouton & M. S. Processes, 24, 405–415.
Fanselow (Eds.), Learning, motivation, and cognition: Shanks, D. R., Holyoak, K. J., & Medin, D. L. (Eds.).
The functional behaviorism of Robert C. Bolles (pp. (1996). The Psychology of Learning and Motivation (Vol.
99–115). Washington, DC: American Psychological 34): Causal Learning. San Diego, CA: Academic Press.
Association. Shanks, D. R., Lopez, F. J., Darby, R. J., & Dickinson,
Seligman, M. E. (1970). On the generality of the laws A. (1996). Distinguishing associative and probabi-
of learning. Psychological Review, 77, 406–418. listic contrast theories of human contingency judg-
Seligman, M. E. (1971). Phobias and preparedness. ments. In D. R. Shanks, K. J. Holyoak, & D. E. Medin
Behavior Therapy, 2, 307–320. (Eds.), The psychology of learning and motivation: Vol.
Seligman, M. E. P. (1968). Chronic fear produced 34. Causal learning (pp. 265–312). San Diego, CA:
by unpredictable shock. Journal of Comparative and Academic Press.
Physiological Psychology, 66, 402–411. Sheffield, F. D. (1965). Relation between classical
Seligman, M. E. P. (1975). Helplessness: On depression, conditioning and instrumental learning. In W. F.
development, and death. San Francisco: Freeman. Prokasy (Ed.), Classical conditioning (pp. 302–322).
Seligman, M. E. P. (1990). Learned optimism. New York: New York: Appleton-Century-Crofts.
A. A. Knopf. Sheffield, F. D., & Campbell, B. A. (1954). The role of
Seligman, M. E. P., & Johnston, J. C. (1973). A cogni- experience in the “spontaneous” activity of hungry
tive theory of avoidance learning. In F. J. McGuigan, rats. Journal of Comparative and Physiological Psychol-
& D. B. Lumsden (Eds.), Contemporary approaches to ogy, 47, 97–100.
conditioning and learning. Washington, DC: Winston. Sheffield, F. D., & Roby, T. B. (1950). Reward value of a
Seligman, M. E. P., & Maier, S. F. (1967). Failure to non-nutritive sweet taste. Journal of Comparative and
escape traumatic shock. Journal of Experimental Physiological Psychology, 43, 471–481.
Psychology, 74, 1–9. Sheffield, F. D., Wulff, J. J., & Barker, R. (1951). Reward
Seligman, M. E. P., Maier, S. F., & Solomon, R. L. value of copulation without sex drive reduction.
(1971). Unpredictable and uncontrollable aversive Journal of Comparative and Physiological Psychology,
events. In F. R. Brush (Ed.), Aversive conditioning and 44, 3–8.
learning (pp. 347–401). New York: Academic Press. Sheffield, V. F. (1949). Extinction as a function of
Seligman, M. E. P., Rosellini, R. A., & Kozak, M. J. partial reinforcement and distribution of practice.
(1975). Learned helplessness in the rat: Time course, Journal of Experimental Psychology, 39, 511–526.
immunization, and reversibility. Journal of Compara- Shepp, B. E., & Eimas, P. D. (1964). Intradimensional
tive and Physiological Psychology, 88, 542–547. and extradimensional shifts in the rat. Journal of
Sevenster, D., Beckers, T., & Kindt, M. (2012). Re- Comparative and Physiological Psychology, 57, 357–364.
trieval per se is not sufficient to trigger reconsolida- Sherry, D. F., & Schacter, D. L. (1987). The evolution of
tion of human fear memory. Neurobiology of Learning multiple memory systems. Psychological Review, 94,
and Memory, 97, 338–345. 439–454.
Sevenster, D., Beckers, T., & Kindt, M. (2012). Instruct- Shettleworth, S. J. (1975). Reinforcement and the or-
ed extinction differentially affects the emotional and ganization of behavior in golden hamsters: Hunger,
cognitive expression of associative fear memory. environment, and food reinforcement. Journal of
Psychophysiology, 49, 1426–1435. Experimental Psychology: Animal Behavior Processes, 1,
Shahan, T. A. (2010). Conditioned reinforcement and 56–87.
response strength. Journal of the Experimental Analysis Shettleworth, S. J. (1978). Reinforcement and the orga-
of Behavior, 93, 269–289. nization of behavior in golden hamsters: Pavlovian
Shamosh, N. A., DeYoung, C. G., Green, A. E., Reis, conditioning with food and shock unconditioned
D. L., Johnson, M. R., Conway, A. R. A., Engle, R. stimuli. Journal of Experimental Psychology: Animal
W., Braver, T. S., & Gray, J. R. (2008). Individual Behavior Processes, 4, 152–169.
520  References

Shettleworth, S. J. (1998). Cognition, evolution, and Siegel, S., & Ramos, B. M. C. (2002). Applying labora-
behavior. New York: Oxford University Press. tory research: Drug anticipation and the treatment
Shettleworth, S. J. (2002). Spatial behavior, food stor- of drug addiction. Experimental and Clinical Psycho-
ing, and the modular mind. In M. Bekoff, C. Allen, & pharmacology, 10, 162–183.
G. M. Burghardt (Ed.), The cognitive animal: Empirical Siegel, S., & Wagner, A. R. (1963). Extended acquisi-
and theoretical perspectives on animal cognition (pp. tion training and resistance to extinction. Journal of
123–128). Cambridge, MA: MIT Press. Experimental Psychology, 66, 308–310.
Shettleworth, S. J., & Juergensen, M. R. (1980). Siegel, S., Allan, L. G., & Eissenberg, T. (1994). Scan-
Reinforcement and the organization of behavior in ning and form-contingent color aftereffects. Journal
golden hamsters: Brain stimulation reinforcement of Experimental Psychology: General, 123, 91–94.
for seven action patterns. Journal of Experimental Siegel, S., Baptista, M. A. S., Kim, J. A., McDonald, R.
Psychology: Animal Behavior Processes, 6, 352–375. V., & Weise-Kelly, L. (2000). Pavlovian psychophar-
Shimp, C. P. (1966). Probabilistically reinforced choice macology: The associative basis of tolerance. Experi-
behavior in pigeons. Journal of the Experimental mental and Clinical Psychopharmacology, 8, 276–293.
Analysis of Behavior, 9, 443–455. Siegel, S., Hinson, R. E., Krank, M. D., & McCully, J.
Shors, T. J., & Mathew, P. R. (1998). NMDA receptor (1982). Heroin “overdose” death: Contribution of
antagonism in the lateral/basolateral but not central drug-associated environmental cues. Science, 216,
nucleus of the amygdala prevents the induction of 436–437.
facilitated learning in response to stress. Learning Silverstein, S. M., Spaulding, W. D., Menditto, A. A.,
and Memory, 5, 220–230. Savitz, A., Liberman, R. P., Berten, S., & Starobin, H.
Sidman, M. (1953). Avoidance conditioning with brief (2009). Attention shaping: A reward-based learn-
shock and no exteroceptive warning signal. Science, ing method to enhance skills training outcomes in
118, 157–158. schizophrenia. Schizophrenia Bulletin, 35, 222–232.
Sidman, M. (1990). Equivalence relations: Where do Simon, H. A., & Kaplan, C. A. (1989). Foundations of
they come from? In D. E. Blackman & H. Lejeune cognitive science. In M. I. Posner (Ed.), Foundations
(Eds.), Behavioral analysis in theory and practice: Con- of cognitive science. (pp. 1–47). Cambridge, MA: MIT
tributions and controversies (pp. 93–114). Hillsdale, NJ: Press.
Lawrence Erlbaum Associates, Inc. Singer, R. A., & Zentall, T. R. (2007). Pigeons learn to
Sidman, M. (2000). Equivalence relations and the re- answer the question “where did you just peck?” and
inforcement contingency. Journal of the Experimental can report peck location when unexpectedly asked.
Analysis of Behavior, 74, 127–146. Learning & Behavior, 35, 184–189.
Siegel, S. (1969). Generalization of latent inhibition. Skinner, B. F. (1931). The concept of the reflex in the
Journal of Comparative and Physiological Psychology, 69, description of behavior. Journal of General Psychology,
157–159. 5, 427–458.
Siegel, S. (1972). Conditioning of insulin-induced Skinner, B. F. (1935). Two types of conditioned reflex
glycemia. Journal of Comparative and Physiological and a pseudo-type. Journal of General Psychology, 12,
Psychology, 78, 233–241. 66–77.
Siegel, S. (1975). Evidence from rats that morphine Skinner, B. F. (1938). The behavior of organisms: An ex-
tolerance is a learned response. Journal of Compara- perimental analysis. New York: D. Appleton-Century
tive and Physiological Psychology, 89, 498–506. Company, Inc.
Siegel, S. (1977). Morphine tolerance acquisition as an Skinner, B. F. (1948). “Superstition” in the pigeon.
associative process. Journal of Experimental Psychol- Journal of Experimental Psychology, 38, 168–172.
ogy: Animal Behavior Processes, 3, 1–13. Skinner, B. F. (1956). A case history in scientific
Siegel, S. (1984). Pavlovian conditioning and heroin method. American Psychologist, 11, 221–233.
overdose: Reports by overdose victims. Bulletin of Skinner, B. F. (1971). Beyond freedom & dignity. New
the Psychonomic Society, 22, 428–430. York: Alfred A. Knopf.
Siegel, S. (1989). Pharmacological conditioning and Skinner, B. F. (1981). Selection by consequences. Sci-
drug effects. In A. J. Goudie & M. W. Emmett-Ogles- ence, 213, 501–504.
by (Eds.), Psychoactive drugs: Tolerance and sensi- Skinner, D. M., Etchegary, C. M., Ekert-Maret, E. C.,
tization. Contemporary neuroscience. (pp. 115–180). Baker, C. J., Harley, C. W., Evans, J. H., & Martin, G.
Clifton, NJ: Humana Press. M. (2003). An analysis of response, direction, and
Siegel, S. (2008). Learning and the wisdom of the place learning in an open field and T maze. Journal
body. Learning & Behavior, 36, 242–252. of Experimental Psychology: Animal Behavior Processes,
Siegel, S., & Allan, L. G. (1998). Learning and homeo- 29, 3–13.
stasis: Drug addiction and the McCollough effect. Slotnick, B. M., Westbrook, F., & Darling, F. M. C.
Psychological Bulletin, 124, 230–239. (1997). What the rat’s nose tells the rat’s mouth:
Siegel, S., & Ellsworth, D. W. (1986). Pavlovian condi- Long delay aversion conditioning with aqueous
tioning and death from apparent overdose of medi- odors and potentiation of taste by odors. Animal
cally prescribed morphine: A case report. Bulletin of Learning & Behavior, 25, 357–369.
the Psychonomic Society, 24, 278–280. Smith, B. H., & Cobey, S. C. (1994). The olfactory
memory of the honeybee Apis mellifera II. Blocking
References  521

between odorants and binary mixtures. Journal of Soltysik, S. S., Wolfe, G. E., Nicholas, T., Wilson, W. J.,
Experimental Biology, 195, 91–108. & Garcia-Sanchez, J. L. (1983). Blocking of inhibitory
Smith, J. C., & Roll, D. L. (1967). Trace conditioning conditioning within a serial conditioned stimulus-
with x-rays as an aversive stimulus. Psychonomic conditioned inhibitor compound: Maintenance of
Science, 9, 11–12. acquired behavior without an unconditioned stimu-
Smith, J. D., Beran, M. J., & Couchman, J. J. (2012). An- lus. Learning and Motivation, 14, 1–29.
imal metacognition. In T. Zentall & E A. Wasserman Sovrano, V. A., Bisazza, A., & Vallortigara, G. (2003).
(Eds), The Oxford handbook of comparative cognition Modularity as a fish (Xenotoca eiseni) views it: Con-
(pp. 282–304). New York: Oxford University Press. joining geometric and nongeometric information for
Smith, J. D., Couchman, J. J., & Beran, M. J. (2014). spatial reorientation. Journal of Experimental Psychol-
Animal metacognition: A tale of two comparative ogy: Animal Behavior Processes, 29, 199–210.
psychologies. Journal of Comparative Psychology, 128, Spear, N. E. (1978). The processing of memories: Forget-
115–131. ting and retention. Hillsdale, NJ: Lawrence Erlbaum
Smith, J. D., Beran, M. J., Redford, J. S., & Washburn, Associates, Inc.
D. A. (2006). Dissociating uncertainty states and Spear, N. E., & Parsons, P. J. (1976). Analysis of a
reinforcement signals in the comparative study of reactivation treatment: Ontogenetic determinants of
metacognition. Journal of Experimental Psychology: alleviated forgetting. In D. C. Medin, W. A. Roberts,
General, 135, 282–297. & R. T. Davis (Eds.), Processes of animal memory (pp.
Smith, J. D., Coutinho, M. V. C., Church, B. A., & 135–166). Hillsdale, NJ: Lawrence Erlbaum Associ-
Beran, M. J. (2013). Executive-attentional uncertainty ates, Inc.
responses by rhesus macaques (Macaca mulatta). Spence, K. W. (1936). The nature of discrimina-
Journal of Experimental Psychology: General, 142, tion learning in animals. Psychological Review, 43,
458–475. 427–449.
Smith, J. D., Schull, J., Strote, J., McGee, K., Egnor, Spence, K. W. (1947). The role of secondary reinforce-
R., & Erb, L. (1995). The uncertain response in the ment in delayed reward learning. Psychological
bottlenosed dolphin (Tursiops truncatus). Journal of Review, 54, 1–8.
Experimental Psychology: General, 124, 391–408. Spence, K. W. (1951). Theoretical interpretations of
Smith, M. C. (1968). CS-US interval and US intensity learning. In S. S. Stevens (Ed.), Handbook of experi-
in classical conditioning of the rabbit’s nictitating mental psychology (pp. 690–729). New York: Wiley.
membrane response. Journal of Comparative and Spence, K. W. (1956). Behavior theory and conditioning.
Physiological Psychology, 66, 679–687. New Haven, CT: Yale University Press.
Smith, M. C., Coleman, S. R., & Gormezano, I. (1969). Spetch, M. L. (1995). Overshadowing in landmark
Classical conditioning of the rabbit’s nictitating learning: Touch-screen studies with pigeons and
membrane response at backward, simultaneous, and humans. Journal of Experimental Psychology: Animal
forward CS-US intervals. Journal of Comparative and Behavior Processes, 21, 166–181.
Physiological Psychology, 69, 226–231. Spetch, M. L., & Friedman, A. (2003). Recogniz-
Smith, S. M. (1979). Remembering in and out of ing rotated views of objects: Interpolation versus
context. Journal of Experimental Psychology: Human generalization by humans and pigeons. Psychonomic
Learning and Memory, 5, 460–471. Bulletin & Review, 10, 135–140.
Smith, S. M. (1988). Environmental context-dependent Spetch, M. L., & Weisman, R. G. (2012). Birds’ percep-
memory. In G. M. Davies & D. M. Thomson (Eds.), tion of depth and objects in pictures. In O. F. Laza-
Memory in context: Context in memory (pp. 13–34). reva, T. Shimizu, & E. A. Wasserman (Eds.), How
Chilchester, UK: J. Wiley. animals see the world: Comparative behavior, biology, and
Smith, S. M., & Vela, E. (2001). Environmental context- evolution of vision (pp. 217–232). New York: Oxford
dependent memory: A review and meta-analysis. University Press.
Psychonomic Bulletin & Review, 8, 203–220. Spetch, M. L., Cheng, K., & MacDonald, S. E. (1996).
Soeter, M., & Kindt, M. (2012). Stimulation of the Learning the configuration of a landmark array: I.
noradrenergic system during memory formation Touch-screen studies with pigeons and humans.
impairs extinction learning but not the disruption Journal of Comparative Psychology, 110, 55–68.
of reconsolidation. Neuropsychopharmacology, 37, Spetch, M. L., Cheng, K., MacDonald, S. E., Linken-
1204–1215. hoker, B. A., Kelly, D. M., & Doerkson, S. R. (1997).
Solomon, R. L. (1980). The opponent-process theory of Use of landmark configuration in pigeons and
acquired motivation: The costs of pleasure and the humans: II. Generality across search tasks. Journal of
benefits of pain. American Psychologist, 35, 691–712. Comparative Psychology, 111, 14–24.
Solomon, R. L., & Corbit, J. D. (1974). An opponent- Spetch, M. L., Rust, T. B., Kamil, A. C., & Jones, J. E.
process theory of motivation: I. Temporal dynamics (2003). Search by rules: Pigeons’ (Columbia livia)
of affect. Psychological Review, 81, 119–145. landmark-based search according to constant
Solomon, R. L., & Turner, L. H. (1962). Discriminative bearing or constant distance. Journal of Comparative
classical conditioning in dogs paralyzed by curare Psychology, 117, 123–132.
can later control discriminative avoidance responses Spreat, S., & Spreat, S. R. (1982). Learning principles.
in the normal state. Psychological Review, 69, 202–218. In V. Voith & P. L. Borchelt (Eds.), Veterinary clinics
522  References

of North America: Small animal practice (pp. 593–606). 129–151). Hillsdale, NJ: Lawrence Erlbaum Associ-
Philadelphia, PA: W. B. Saunders. ates, Inc.
Squire, L. R. (1987). Memory and brain. New York: Stewart, J., de Wit, H., & Eikelboom, R. (1984). Role
Oxford University Press. of unconditioned and conditioned drug effects in
St. Claire-Smith, R. (1979). The overshadowing and the self-administration of opiates and stimulants.
blocking of punishment. Quarterly Journal of Experi- Psychological Review, 91, 251–268.
mental Psychology, 31B, 51–61. Stolerman, I. P. (1992). Drugs of abuse: Behavioral
Staddon, J. E. (1979). Operant behavior as adapta- principles, methods and terms. Trends in Pharmaco-
tion to constraint. Journal of Experimental Psychology: logical Sciences, 13, 170–176.
General, 108, 48–67. Stout, S. C., & Miller, R. R. (2007). Sometimes-
Staddon, J. E. (1983). Adaptive behavior and learning. competing retrieval (SOCR): A formalization of the
New York: Cambridge University Press. comparator hypothesis. Psychological Review, 114,
Staddon, J. E. R. (1977). Schedule-induced behavior. 759–783.
In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of Suarez, S. D., & Gallup, G. G. (1981). Predatory
operant behavior (pp. 125–152). Englewood Cliffs, NJ: overtones of open-field testing in chickens. Animal
Prentice-Hall. Learning & Behavior, 9, 153–163.
Staddon, J. E. R., & Ayres, S. L. (1975). Sequential and Sunsay, C., Stetson, L., & Bouton, M. E. (2004).
temporal properties of behavior induced by a sched- Memory priming and trial spacing effects in Pavlov-
ule of periodic food delivery. Behaviour, 54, 26–49. ian learning. Learning & Behavior, 32, 220–229.
Staddon, J. E. R., & Higa, J. J. (1999). Time and Sutherland, R. J., Chew, G. L., Baker, J. L., & Linggard,
memory: Towards a pacemaker-free theory of R. C. (1987). Some limitations on the use of distal
interval timing. Journal of the Experimental Analysis of cues in place navigation by rats. Psychobiology, 15,
Behavior, 72, 225–252. 48–57.
Staddon, J. E. R., & Simmelhag, V. L. (1971). The Sutton, R. S., & Barto, A. G. (1981). Toward a modern
“superstition” experiment: A reexamination of its theory of adaptive networks: Expectation and pre-
implications for the principle of adaptive behavior. diction. Psychological Review, 88, 135–170.
Psychological Review, 78, 3–43. Sutton, R. S., & Barto, A. G. (1998). Reinforcement learn-
Stampfl, T. G., & Levis, D. J. (1967). Essentials of ing: An introduction. Cambridge, MA: MIT Press.
implosive therapy: A learning-theory-based psy- Suzuki, A., Josselyn, S. A., Frankland, P. W., Masush-
chodynamic behavioral therapy. Journal of Abnormal ige, S., Silva, A. J., & Kida, S. (2004). Memory recon-
Psychology, 72, 496–503. solidation and extinction have distinct temporal and
Starr, M. D. (1978). An opponent-process theory of biochemical signatures. Journal of Neuroscience, 24,
motivation: VI. Time and intensity variables in the 4787–4795.
development of separation-induced distress calling Suzuki, S., Augerinos, G., & Black, A. H. (1980).
in ducklings. Journal of Experimental Psychology: Stimulus control of spatial behavior on the eight-
Animal Behavior Processes, 4, 338–355. arm maze in rats. Learning and Motivation, 11, 1–18.
Starr, M. D., & Mineka, S. (1977). Determinants of fear Swartzentruber, D. (1995). Modulatory mechanisms in
over the course of avoidance learning. Learning and Pavlovian conditioning. Animal Learning & Behavior,
Motivation, 8, 332–350. 23, 123–143.
Stein, J. S., Renda, C. R., Hinnenkamp, J. E., & Mad- Swartzentruber, D., & Bouton, M. E. (1986). Contex-
den, G. J. (2015). Impulsive choice, alcohol con- tual control of negative transfer produced by prior
sumption, and pre-exposure to delayed rewards: CS-US pairings. Learning and Motivation, 17, 366–385.
II. Potential mechanisms. Journal of the Experimental Swartzentruber, D., & Rescorla, R. A. (1994). Modula-
Analysis of Behavior, 103, 33–49. tion of trained and extinguished stimuli by facilita-
Steinmetz, J. E. (1996). The brain substrates of classical tors and inhibitors. Animal Learning & Behavior, 22,
eyeblink conditioning in rabbits. In J. Bloedel, T. Eb- 309–316.
ner, & S. Wise (Eds.), Acquisition of motor behavior in Symonds, M., Hall, G., & Bailey, G. K. (2002). Percep-
vertebrates (pp. 89–114). Cambridge, MA: MIT Press. tual learning with a sodium depletion procedure.
Steinmetz, J. E., Gluck, M. A., & Solomon, P. R. Journal of Experimental Psychology: Animal Behavior
(Eds.). (2001). Model systems and the neurobiology of Processes, 28, 190–199.
associative learning. Hillsdale, NJ: Lawrence Erlbaum T
Associates, Inc. Tait, R. W., & Saladin, M. E. (1986). Concurrent de-
Stevenson-Hinde, J. (1973). Constraints on reinforce- velopment of excitatory and inhibitory associations
ment. In R. A. Hinde & J. Stevenson-Hinde (Eds.), during backward conditioning. Animal Learning &
Constraints on learning: Limitations and predispositions. Behavior, 14, 133–137.
New York: Academic Press. Tangen, J. M., & Allan, L. G. (2003). Cue interaction
Stewart, J. (1992). Conditioned stimulus control of and judgments of causality: Contributions of causal
the expression of sensitization of the behavioral and associative processes. Memory and Cognition, 32:
activating effects of opiate and stimulant drugs. In I. 107–24.
Gormezano & E. A. Wasserman (Eds.), Learning and
memory: The behavioral and biological substrates (pp.
References  523

Tarpy, R. M., & Sawabini, F. L. (1974). Reinforcement subsidies to improve diets: Understanding the
delay: A selective review of the last decade. Psycho- recent evidence. Nutrition Reviews, 72, 551–565.
logical Bulletin, 81, 984–997. Thrailkill, E. A., & Bouton, M. E. (2015). Contextual
Temple, J. L., Giacomelli, A. M., Roemmich, J. N., control of instrumental actions and habits. Journal of
& Epstein, L. H. (2008). Habituation and within- Experimental Psychology: Animal Learning and Cogni-
session changes in motivated responding for food in tion, 41, 69–80.
children. Appetite, 50, 390–396. Thrailkill, E. C., & Bouton, M. E. (2015). Extinction of
Terrace, H. S. (1963). Errorless transfer of a discrimi- chained instrumental behaviors: Effects of procure-
nation across two continua. Journal of the Experimen- ment extinction on consumption responding. Journal
tal Analysis of Behavior, 6, 223–232. of Experimental Psychology: Animal Learning and
Terry, W. S. (1976). Effects of priming unconditioned Cognition, 41, 232–246.
stimulus representation in short-term memory on Tiffany, S. T., Drobes, D. J., & Cepeda-Benito, A.
Pavlovian conditioning. Journal of Experimental Psy- (1992). Contribution of associative and nonassocia-
chology: Animal Behavior Processes, 2, 354–369. tive processes to the development of morphine
Testa, T. J., Juraska, J. M., & Maier, S. F. (1974). Prior tolerance. Psychopharmacology, 109, 185–190.
exposure to inescapable electric shock in rats affects Timberlake, W. (1980). An equilibrium theory of
extinction behavior after the successful acquisition learned performance. In G. H. Bower (Ed.), Psychol-
of an escape response. Learning and Motivation, 5, ogy of learning and motivation (Vol. 14, pp. 1–58). New
380–392. York: Academic Press.
Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improv- Timberlake, W. (1983). The functional organization of
ing decisions about health, wealth, and happiness. New appetitive behavior: Behavior systems and learning.
Haven: Yale University Press. In M. D. Zeiler & P. Harzen (Eds.), Advances in the
Thewissen, R., Snijders, S. J. B. D., Havermans, R. analysis of behavior, Vol. 3 (pp. 77–221). Chichester,
C., van den Hout, M., & Jansen, A. (2006). Renewal UK: J. Wiley.
of cue-elicited urge to smoke: Implications for cue Timberlake, W. (1984). Behavior regulation and
exposure treatment. Behaviour Research and Therapy, learned performance: Some misapprehensions and
44, 1441–1449. disagreements. Journal of the Experimental Analysis of
Thomas, D. A. (1979). Retention of conditioned inhibi- Behavior, 41, 355–375.
tion in a bar-press suppression paradigm. Learning Timberlake, W. (1994). Behavior systems, associa-
and Motivation, 10, 161–177. tionism, and Pavlovian conditioning. Psychonomic
Thomas, D. R., & Barker, E. G. (1964). The effects Bulletin & Review, 1, 405–420.
of extinction and “central tendency” on stimulus Timberlake, W. (2001). Motivational modes in
generalization in pigeons. Psychonomic Science, 1, behavior systems. In R. R. Mowrer & S. B. Klein
119–121. (Eds.), Handbook of contemporary learning theories (pp.
Thomas, D. R., & Lopez, L. J. (1962). The effects of 155–210). Mahwah, NJ: Lawrence Erlbaum Associ-
delayed testing on generalization slope. Journal of ates, Inc.
Comparative and Physiological Psychology, 55, 541–544. Timberlake, W., & Allison, J. (1974). Response de-
Thomas, D. R., Mood, K., Morrison, S., & Wiertelak, privation: An empirical approach to instrumental
E. (1991). Peak shift revisited: A test of alternative performance. Psychological Review, 81, 146–164.
interpretations. Journal of Experimental Psychology: Timberlake, W., & Farmer-Dougan, V. A. (1991). Re-
Animal Behavior Processes, 17, 130–140. inforcement in applied settings: Figuring out ahead
Thomas, G. V., Robertson, D., & Lieberman, D. A. of time what will work. Psychological Bulletin, 110,
(1987). Marking effects in Pavlovian trace condi- 379–391.
tioning. Journal of Experimental Psychology: Animal Timberlake, W., & Grant, D. L. (1975). Auto-shaping
Behavior Processes, 13, 126–135. in rats to the presentation of another rat predicting
Thompson, R. F. (1986). The neurobiology of learning food. Science, 190, 690–692.
and memory. Science, 233, 941–947. Timberlake, W., & Lucas, G. A. (1985). The basis of su-
Thompson, R. F., & Krupa, D. J. (1994). Organization perstitious behavior: Chance contingency, stimulus
of memory traces in the mammalian brain. Annual substitution, or appetitive behavior? Journal of the
Review of Neuroscience, 17, 519–549. Experimental Analysis of Behavior, 44, 279–299.
Thompson, R. F., & Spencer, W. A. (1966). Habitua- Timberlake, W., & Lucas, G. A. (1991). Periodic water,
tion: A model phenomenon for the study of neuro- interwater interval, and adjunctive behavior in a 24-
nal substrates of behavior. Psychological Review, 73, hour multi-response environment. Animal Learning
16–43. & Behavior, 19, 369–380.
Thompson, R. F., & Steinmetz, J. E. (2009). The role of Timberlake, W., & Silva, K. M. (1995). Appetitive
the cerebellum in classical conditioning of discrete behavior in ethology, psychology, and behavioral
behavioral responses. Neuroscience, 162, 732–755. systems. In N. S. Thompson (Ed.), Perspectives in
Thorndike, E. L. (1911). Animal intelligence: Experimen- ethology, Vol. 11: Behavioral design (pp. 211–253). New
tal studies. New York: Macmillan. York: Plenum Press.
Thow, A. M., Downs, S., & Jan, S. (2014). A system- Timberlake, W., Wahl, G., & King, D. (1982). Stimulus
atic review of the effectiveness of food taxes and and response contingencies in the misbehavior of
Author Index
A Asen, Y., 298 Barlow, D. H., 16, 66, 198, 199, 426
Abel, T., 177 Astley, S. L., 319 Barnes, J. M., 164
Ablan, C. D., 227 Atkinson, R. C., 23, 136 Barnet, R. C., 128
Abramson, L. Y., 433 Audrain-McGovern, J., 271 Bartlett, F. C., 176
Adams, C. D., 446, 451, 457 Augerinos, G., 329 Barto, A. G., 229, 458
Aeschliman, B., 459 Aust, U., 304 Basile, B. M., 358, 359
Aguado, L., 172 Austen, J. M., 353 Basoglu, M., 199
Ainslie, G., 268, 269 Austin, A., 134 Bass, T. D., 227
Akers, K. G., 350 Axelrod, S., 161 Batsell, W. R., 218, 219
Akil, H., 395 Aydin, A., 352, 384 Batsell, W. R., Jr., 218
Akins, C., 59 Ayres, J. J., 92, 128, 209 Batson, J. D., 218
Akins, C. K., 196, 197 Ayres, J. J. B., 418 Battalio, R. C., 263
Albert, M., 92 Ayres, S. L., 438, 439 Baum, W. M., 263, 266, 418, 452
Alcock, J., 43, 44 Azrin, N. H., 70 Baumeister, R. F., 271
Alexander, J. H., 433 B Bavieri, M., 90
Allan, L. G., 189, 238, 339 Baackes, M. P., 60, 189 Baxter, C., 270
Allen, T. A., 334 Babb, S. J., 334 Beatty, W. W., 330
Alling, K., 27 Barker, R., 277 Beck, H. P., 16
Allison, J., 272, 281, 282, 283, 284 Baddeley, A. D., 165 Becker, H. C., 227, 378
Allman, M. J., 343 Baerends, G. P., 194 Beckers, T., 176, 228, 233, 234, 239
Allswede, D. M.., 218 Baetu, I., 238 Been, S., 233
Almaraz, J., 238 Baeyens, F., 429 Beesley, T., 130, 135
Alonso, G., 98, 338, 339 Bailey, G. K., 172 Belli, R. F., 164
Amsel, A., 227, 380, 381, 385 Baker, J. C., 349 Bennett, C. H., 315, 316
Anderson, J. R., 26 Baker, A. G., 126, 128, 142, 222, 228, Beran, M. J., 358
Andrade, L. F., 272 235, 238, 240, 432 Berlau, D. J., 177
Andresen, G. V., 106 Baker, T. B., 404, 415 Berlie, J., 343
Andrew, B. J., 90 Balaz, M. A., 131 Berman, T. E., 177
Anger, D., 417 Balleine, B., 173, 369, 393, 457 Bernstein, I. L., 90, 96, 106, 207
Anisman, H., 433 Balleine, B. W., 27, 316, 368, 369, Berridge, K. C., 192, 395, 402, 404
Anker, J. J., 271 370, 391, 392, 404, 446, 447, 449, Best, M. R., 140, 145, 212, 213, 218
Annau, Z., 70, 94 450, 452, 453, 459, 461 Betts, S. L., 149, 187
Anokhin, K. V., 177 Balsam, P., 88, 92, 93, 127 Bevins, R. A., 33
Antoniadis, E., 344 Balsam, P. D., 88, 127, 287 Bhatt, R. S., 298, 299, 300, 301
Aparicio, J., 456 Barash, D., 43 Bickel, W. K., 270, 271, 272, 274
Arakaki, L., 227 Bardo, M. T., 33 Biebach, H., 335
Arcediano, F., 238 Barelare, B., Jr., 365 Biederman, I., 306, 321
Arnold, H. M., 370 Bargh, J. A., 133, 457 Biegler, R., 351
Artigas, A. A., 316 Barker, E. G., 312 Bindra, D., 441
Birch, L. L., 106, 373, 374 Brown, E. K., 358, 359 Cheng, P. W., 235, 237, 238
Birk, J., 56 Brown, E. R., 218 Chew, G. L., 349
Birmingham, K. M., 386 Brown, G. S., 328 Choi, J.-S., 427
Bisazza, A., 346 Brown, J. S., 186, 414 Christian, K. M., 86
Bitterman, M. E., 63, 103, 125, 223, Brown, M. F., 330, 347 Christoph, G., 63, 391
225, 226, 227, 312 Brown, M. L., 193 Chung, S.-H., 262
Bizo, L. A., 343 Brown, P. L., 63, 87, 440, 441 Church, B. A., 358
Black, A. H., 329, 388, 415 Brown, R. T., 385 Church, R. M., 67, 70, 337, 338, 340,
Blaisdell, A. P., 127, 301 Bruell, J., 58 341, 342, 397
Blanchard, D. C., 59, 424 Buhusi, C. V., 343 Churchill, M., 142
Blanchard, R. J., 59, 424 Bull, J. A., 394 Clark, D. M., 428
Blankenship, A. G., 218 Bumanglag, A. V., 227 Clark, R. E., 215
Blei, D. M., 459 Burgos, J. E., 287 Clayton, N. S., 332, 333, 334
Blesbois, E., 58 Burish, T. G., 90 Cobey, S. C., 227
Bleske, A. L., 220 Burke, J., 236 Cobos, P. L., 238
Blodgett, H. C., 350 Burkhardt, P. E., 92 Cochran, S. R., 218
Blough, D. S., 325, 326 Burstein, K. R., 308 Cohen, P. S., 439
Blough, P. M., 326 Bush, R. R., 117 Colagiuri, B., 64, 390
Boag, P. T., 311 Buss, D. M., 220 Colbert, M. M., 57
Boakes, R. A., 10, 11, 16, 64, 80, 89, Byron, K., 384 Coleman, K. J., 49
206, 236, 336, 440, 457 C Coleman, S. R., 92
Boggiano, M. M., 374 Cabanac, M., 370 Collier, G., 258, 367, 371, 372, 373,
Bolles, R. C., 8, 15, 20, 28, 50, 56, Cable, C., 296 439
59, 60, 72, 87, 92, 171, 195, 256, Cade, W. H., 46 Collins, B. N., 170
257, 277, 335, 336, 365, 366, 367, Cadieux, E. L., 57 Collison, C., 329
368, 377, 388, 391, 418, 419, 420, Caggiula, A. R., 48 Colwill, R. M., 28, 392, 394, 446,
421, 422, 423, 424, 425, 427, 428, Cain, C. K., 427 447, 451, 452, 453, 455, 456, 458
443, 444, 446 Cameron, J., 382, 383, 384 Conger, R., 262
Bombace, J. C., 187 Camp, D. A., 70 Conklin, C. A., 184
Bonardi, C., 186, 432 Camp, D. S., 69 Conway, C. C., 66
Bond, A. B., 42, 293, 294, 325 Campagnoni, F. R., 439 Conway, M. A., 149
Boneau, C. A., 308 Campbell, B. A., 163, 366 Cook, M., 74, 416, 434
Bossert, J. M., 170 Campbell, D. H., 56 Cook, R. G., 298, 301, 305, 306, 321,
Boumphrey, P., 134 Campos, A. C., 227 322, 323, 325, 326, 330, 331
Bouton, M. E., 16, 25, 48, 59, 62, 65, Caño, A., 238 Cook, S. W., 215, 239
66, 70, 87, 94, 101, 127, 139, 140, Cantor, M. B., 440 Cooke, A. M., 227
151, 153, 165, 166, 167, 168, 169, Capaldi, E. D., 386 Coon, D. J., 433
170, 171, 172, 184, 185, 186, 198, Capaldi, E. J., 56, 67, 370, 385, 386 Cooney, J. B., 385
199, 209, 215, 219, 233, 256, 387, Cappell, H., 61 Cooper, R. M., 46
391, 426, 428, 442, 456, 457 Carr, J. A. R., 335 Coppage, D. J., 319
Bower, G. H., 8, 69, 229 Carroll, M. E., 271 Corbit, J. D., 92, 396, 397, 398, 401
Bowman, M. T., 218 Carter, B. L., 395 Corbit, L. H., 391, 392, 393, 394,
Bradford, J. P., 56 Castro, L., 302, 303, 324, 325 453, 457
Brandon, S. E., 83, 144, 148, 149, Catania, A. C., 259, 260, 262, 264, Cornet, S., 342
150, 151, 152, 153, 186, 187, 191, 265, 284, 337, 451 Cosmides, L., 221
199, 391, 393 Cavoto, B. R., 323, 325 Costa, D. S. J., 206
Brandon, T. H., 170 Cavoto, K. K., 323 Couchman, J. J., 358
Breland, K., 72, 436, 438, 441 Cepeda-Benito, A., 404 Coutino, M. V. C., 358
Breland, M., 72, 436, 438, 441 Chamizo, V. D., 352 Coutureau, E., 391, 457
Brett, L. P., 216 Champoux, M., 434 Couvillon, P. A., 223, 224, 225, 226,
Brewer, K. E., 415 Chance, S., 222 227
Briggs, G. E., 164 Chang, P. W., 237 Craig, A. R., 264
Brimer, C. J., 415 Changizi, M. A., 370 Crandall, C., 56
Britton, G. B., 58 Channell, S., 47, 140, 142, 172 Craske, M. G., 66, 87, 122, 170
Broadbent, D. E., 136 Chapman, G. B., 231, 234, 236, 271 Crawford, L. L., 59
Broadbent, H. A., 341, 342 Chartrand, T. L., 133, 457 Crespi, L. P., 69, 377
Broberg, D. J., 106 Chase, H. W., 390, 393 Crews, D., 58
Brooks, D. C., 169 Chatlosh, D. L., 235 Crombag, H. S., 4, 170
Brooks, D. I., 298 Checke, S., 227 Crombez, G., 429
Brown, A. M., 127 Chen, S. X., 429 Crowell, C. R., 61
Brown, C., 134 Cheng, K., 311, 344, 345, 346, 353 Crystal, J. D., 334, 335, 342, 343
Digdon, N., 16 Evenden, J., 228, 229
Cuell, S. F., 355 Dimberg, U., 73 Everitt, B. J., 33, 177, 256, 395, 404
Cullen, N., 218 Dinsmoor, J. A., 418 Eysenck, H. J., 16
Cullen, E., 43 Dodwell, P. C., 189 F
Culver, N. C., 122 Doherty, M. F., 342 Fairless, J. L., 97
Cunningham, C. L., 61, 169, 188, Domjan, M., 9, 27, 55, 58, 59, 71, 72, Falk, J. L., 439, 440
390, 395 73, 90, 195, 196, 199, 208 Falls, W. A., 187
Curley, K., 218 Donahoe, J. W., 83, 284, 287 Fanselow, M. S., 27, 56, 59, 60, 87,
Cutolo, P., 177 Dopson, J. C., 127, 135 153, 189, 195, 199, 216, 316, 317,
D Dorsey, J. R., 374 421, 423, 424, 425, 426
Daly, H. B., 64, 381, 382 Downs, S., 275 Fantino, E., 255
Daly, J. T., 382 Doyle, T. A., 439 Farber, I. E., 186
Daly, M., 271 Doyle-Burr, C., 153 Farmer-Dougan, V. A., 282
Daniel, T. O., 270 Dragoin, W. B., 72 Fedorchak, P. M., 56
Darby, R. J., 233, 236 Drobes, D. J., 404 Felsenberg, J., 226
Darling, F. M. C., 219 Dudai, Y., 177 Ferek, J. M., 177
Davey, G. C., 84 Duka, T., 134, 390 Fernando, A. B. P., 428
Davidson, T. L., 370, 456 Dukas, R., 325 Ferreira, M. A., 342
Davis, C. M., 365 Dumas, M. J., 58 Ferreira, T. P., 226, 227
Davis, M., 47, 186, 404 Dunham, P., 281 Ferster, C. B., 260
Davis, N. R., 121 Dunlap, C. M., 219 Fetterman, J. G., 295, 342
Davison, M., 263 Dunn, T., 443 Findley, J. D., 262
Daw, N. D., 459 Durlach, P. J., 127, 218, 219, 316 Finkelstein, E., 275
Dawkins, R., 42, 285, 286, 287 Duvarci, S., 176, 177 Fiore, N. C., 415
Dawson, G. R., 369 Dweck, C. S., 126 Fisher, E. B., 268
Dawson, M. E., 215 Dworkin, B. R., 193 Fisher, J. A., 373, 374
Dayan, P., 459 Dwyer, D. M., 236, 316 Flagel, S. B., 89, 395
Dearing, K. K., 275 E Flaherty, C. F., 227, 377, 378, 412
Dearing, M. F., 390 Ebner, D. L., 161 Fleshler, M., 160
DeCatanzaro, D., 433 Echiverri, A. M., 170 Folke, S., 448
Deci, E. L., 382, 383 Edberg, J. A., 347 Foltin, R. W., 273
Declercq, M., 430 Edhouse, W. V., 328 Forestell, C. A., 214
DeCola, J. P., 216, 434, 435 Edmunds, M., 50 Fortin, N. J., 334
DeGrandpre, R. J., 274 Egan, M., 271 Fowler, H., 26, 128
DeHouwer, J., 228, 233, 234, 238, Ehlers, A., 149 Frank, A. J., 323
239, 429, 430 Ehrman, R., 399 Frank, R., 222
Deich, J. D., 68, 287 Eibl-Eibesfeldt, I., 44, 45 Franklin, S. R., 64, 100
Delamater, A. R., 83, 90, 171, 233, Eikelboom, R., 33, 191, 192, 395 Fraser, J., 164
392, 453 Eimas, P. D., 135 Freed, D. E., 273
de la Mettrie, J., 6 Eisenberg, M., 177 Freeman, J. H., 27, 86
Delaney, L., 271 Eisenberger, R., 281, 382, 383, 384, Friedman, A., 306
de Lorge, J., 336 385 Friedman, B. X., 127
Deluty, M. Z., 338 Eisenhardt, D., 226 Frohardt, R., 233
Denniston, J. C., 127 Eiserer, L. A., 400, 401 Funayama, E. S., 227
de Oliveira Alvares, L., 176 Eissenberg, T., 189 Fussell, C., 352
Deptula, D., 378 Elek, S. M., 235 G
DeSpain, M. J., 330 Ellen, P., 349, 350 Galbicka, G., 53
Dess, N. K., 432, 434, 435 Elliott, M. H., 376 Gale, G. D., 160
Dess-Beech, N., 164 Ellison, G. D., 388 Galef, B. G., 375
de Villiers, P. A., 298 Engberg, L., 311 Gallistel, C. R., 93, 127, 341, 346,
DeVito, P. L., 128 Epstein, D. H., 53 349, 353
DeVolder, C. L., 319 Epstein, L. H., 48, 49, 270, 275 Gallup, G. G., 59
de Wit, H., 33, 395 Epstein, S., 397 Garb, J. L., 55
de Wit, S., 393 Ernst, A. J., 311 Garcia, J., 9, 55, 70, 71, 206, 207,
Dews, P. B., 338, 339 Ervin, F. R., 55, 206 213, 216, 218, 370
Dezfouli, A., 459 Esber, G. R., 134, 135, 355 Garner, C., 368, 449
Dickinson, A., 64, 68, 134, 221, 228, Escobar, M., 238 Garner, K. L., 324
229, 234, 236, 256, 332, 333, 334, Espinet, A., 316 Garrud, P., 349
368, 369, 370, 390, 394, 404, 428, Estes, W. K., 87, 248 Gartrell, K. E., 385
432, 446, 447, 448, 449, 450, 451, Estle, S. J., 267 Gawley, D. J., 440
452, 456, 457, 458 Etienne, A. S., 343 Gelder, M. G., 428
Gemberling, G. A., 9, 71, 72, 140, Grossen, N. E., 421, 428 Hinson, R. E., 61, 62
212, 213 Groves, P. M., 47, 48, 50, 95 Hinton, G. E., 232
Gentry, G. V., 312 Guilford, T. C., 311 Hirsch, E., 258, 372
Georgakopolous, J., 343 Gunnar, M., 434 Hirsch, J., 46
George, D. N., 88, 135 Guthrie, E. R., 247, 248 Hirsch, S. M., 50, 59, 420, 421, 422
Gersham, S. J., 459 Gutierrez, G., 196 Hoffman, H. S., 160, 399, 400, 401
Gharaei, S., 153 Guttman, N., 306 Hogarth, L., 134, 390, 393
Giacomelli, A. M., 49 H Hohmann, A. G., 335
Gibbon, J., 88, 93, 127, 337, 338, Hackmann, A., 149 Holland, P. C., 70, 83, 84, 87, 90, 93,
339, 340, 341, 343 Haddad, C., 92 127, 178, 181, 182, 183, 184, 185,
Gibbs, C. M., 384 Hagenaars, M. A., 426 186, 188, 200, 215, 216, 233, 391,
Gibson, B. M., 461 Hailman, J. P., 46 393, 452, 457, 458
Gibson, E. J., 314 Hall, G., 47, 86, 100, 132, 134, 140, Hollis, K. L., 55, 56, 57, 58, 59, 190
Gibson, J. J., 314 142, 150, 172, 308, 314, 317, 432, Holloway, K. S., 196
Giesen, J. C. A. H., 275 451 Holmberg, J., 448
Gillett, S. R., 301 Hall, S., 58 Holmes, N. M., 391
Gilroy, K. E., 355 Hall, W. G., 370 Holt, D. D., 267
Gino, A., 416 Hallam, S. C., 127 Holt, L. E., Jr., 365
Gisquet-Verrier, P., 177 Hamilton, D. A., 350 Holyoak, K. J., 228, 235, 237, 238
Giurfa, M., 226 Hamilton, W. D., 42 Holz, R., 443
Glasgow, B., 178 Hamlin, A. S., 170 Hommel, B., 393
Glazer, H. I., 433 Hamlin, P. H., 258 Honey, R. C., 142, 153, 317
Gleason, D. I., 218 Hammer, M., 225 Honig, W. K., 26, 308, 312
Gleitman, H., 137, 160 Hammond, L. J., 451 Honzik, C. H., 250, 251, 364, 376,
Gluck, M. A., 173, 229, 233 Hampton, R. R., 356, 357, 358, 359 407
Gobleski, S. A., 395 Hankins, W. G., 213, 216 Horne, M. R., 352, 355
Goddard, M. J., 199 Hanson, H. M., 310, 311 Horton, G. P., 248
Godden, D. R., 165 Hardt, O., 163, 174, 176, 177 Howlett, R. J., 42
González, F., 153, 368, 457 Harris, B., 16 Hsiung, R., 227
Gonzalez, R. C., 312 Harris, J. A., 89, 90, 153, 172 Huber, L., 304
Good, M. A., 142, 219, 352, 353, 355 Harris, R. E., 215, 239 Huber-McDonald, M., 196
Goode, T., 323 Harrison, R. H., 307 Hugart, J. A., 323
Gordijn, M., 335 Haselgrove, M., 90, 127, 134, 135, Hull, C. L., 21, 276, 277, 365, 379
Gordon, W. C., 162, 163, 164, 165, 234, 384 Hulse, S. H., 26
166, 172 Haselton, M. G., 220 Hulse, S. H., Jr., 382
Gormezano, I., 85, 91, 92, 186, 384 Havermans, R. C., 170, 275 Humphrey, G. K., 189
Gottlieb, G., 399 Hayward, A., 352, 355 Hursh, S. R., 272, 273
Gould, J. L., 226 Hayward, L., 56 Hurwitz, H. M., 391
Gould, S. J., 220, 225, 226 Healy, A. F., 248 Hutcheson, D. M., 404
Grace, R. C., 263 Hearst, E., 63, 64, 88, 92, 100, 391 I
Graham, J., 370 Heil, S. H., 271 Iguchi, M. Y., 53
Graham, M., 219 Held, F. P., 92 Imada, H., 65
Grahame, N. J., 127, 128 Hendersen, R. W., 137, 160, 161, Inui, T., 217
Grand, C., 153 162, 370, 434 Irons, G., 16
Grant, B. R., 42 Herling, S., 60 Ison, J. R., 382
Grant, D. L., 188, 193, 194 Hermans, D., 170
Grant, D. S., 327, 328 J
Hermer, L., 346 Jackson, R. L., 161, 433, 434
Grant, P. R., 42 Herrnstein, R. J., 260, 261, 262, 263,
Gredebäck, G., 448 Jackson-Smith, P., 318, 331
264, 265, 296, 297, 298, 417, 418 Jacob, W. J., 216
Green, L., 263, 267, 268, 269, 271, Heth, C. D., 92, 171
273 Jacobs, A., 414
Hetherington, M. M., 447 Jacobs, E. A., 270
Green, S. L., 301, 305 Hicks, L. H., 459
Greene, D., 382 Jacobs, W. J., 215
Higa, J. J., 342, 343 Jan, S., 275
Greene, P., 196 Higgins, S. T., 271, 272, 274
Greggers, U., 225 Janak, P. H., 392, 457
Hilgard, E. R., 8 Jansen, A., 170, 275
Grice, G. R., 206 Hill, P. F., 270
Griffiths, D. P., 332 Jasnow, A. M., 177
Hill, W., 443 Jaynes, J., 163, 366
Griffiths, T. D., 343 Hinde, R. A., 47
Grill, H. J., 213 Jenett, A., 226
Hineline, P. N., 417, 418 Jenkins, H. M., 28, 63, 87, 88, 178,
Grosch, J., 270 Hinnenkamp, J. E., 270
Gross, J., 415 307, 440, 441, 454
Jennings, D., 186 Killeen, P. R., 262, 342, 440 Leaton, R. N., 142
Jensen, S., 459 Kim, J. A., 199 LeBlanc, K. H., 390, 395
Jitsumori, M., 304 Kim, J. J., 87 Leclerc, R., 64
Johansson, A., 448 Kimledorf, D. J., 206 LeDoux, J. E., 27, 86, 87, 173, 174,
Johnson, S., 374 Kindt, M., 174, 175, 176, 186, 239 177, 427
Johnson, A., 459 King, D. A., 62, 166, 167, 168, 171, Lee, C., 435
Johnson, D., 367, 371, 372, 373 195, 437 Lee, J. L. C., 177
Johnson, M. W., 270, 272 Kirby, K. C., 53 Lee, R. K. K., 433
Johnson, P. A., 106 Kirby, K. N., 270 Leitenberg, H., 415
Johnson, P. E., 140, 212, 213 Kirkpatrick-Steger, K., 323 Lejeune, H., 342
Johnson, S. L., 374 Kissinger, S. C., 189 LeMagnen, J., 371
Johnston, J. C., 429 Klein, S. B., 221 Le Magnen, J., 372
Johnston, M., 311 Klossek, U. M. H., 448 Le Moal, M., 402
Jones, D. L., 219 Klosterhalfen, S., 223 Lenz, R., 304
Jones, F. W., 239 Knauss, K. S., 298 Leonard, D. W., 386
Jones, J. E., 344 Koelling, R. A., 9, 55, 70, 71, 206, 207 Le Pelley, M. E., 130, 131, 135, 358
Jones, M. L., 172 Koestner, R., 382 Lepper, M. R., 382
Jones, P. M., 134, 353 Koffarnus, M. N., 272 Lester, L. S., 59, 60, 195, 199, 421,
Jordan, W. P., 142 Konorski, J., 148, 388, 390, 391, 425, 426
Juergensen, M. R., 72 393, 453 Lett, B. T., 211, 218
Juraska, J. M., 434 Koob, G. F., 402 Leung, H. T., 87, 122
K Korol, B., 193 Levinson, S., 16
Kagel, J. H., 263 Kosaki, Y., 353, 458 Levis, D. J., 66, 415
Kaiser, D. H., 318 Kosslyn, S. M., 248 Levison, D. G., 301
Kalat, J. W., 72, 90, 208, 211, 375 Kozak, M. J., 432 Levitsky, D., 367, 439
Kalish, D., 249, 250, 444 Kral, P. A., 72 Levy, S. M., 90
Kalish, H. I., 186, 306 Kramer, P. J., 172 Lewis, D. J., 173
Kamil, A. C., 42, 293, 294, 325, 344 Krank, M. D., 62, 390, 395 Lewis, J. W., 433
Kamin, L. J., 70, 87, 91, 94, 103, 104, Krebs, J. R., 335 Li, X., 4
105, 147, 415, 416, 418, 429 Krehbiel, R., 374 Lieberman, D. A., 211, 219
Kanarek, R. B., 372 Kremer, E. F., 122, 123, 124 Liebeskind, J. C., 433
Kaplan, C. A., 23 Krupa, D. J., 86 Linden, D. R., 432
Kaplan, P. S., 88, 92, 100 Kruschke, J. K., 305 Linggard, R. C., 349
Karpicke, J., 63, 89, 391 Kuhn, T., 154 Lipsitt, L. P., 31, 32
Karpman, M., 281 Kupfer, A. S., 440 Litner, J. S., 428
Kasprow, W. J., 127, 131, 166 Kurth-Nelson, Z., 459 Little, L., 135
Katcher, A. H., 397 Kutlu, M. G., 233 Liu, S. Y., 170
Katz, D. S., 162, 163 Kwok, D. W. S., 64, 90 Livesey, E. J., 153
Katz, J. S., 306, 324, 325, 326 L Locurto, C., 88
Kaye, H., 133, 134, 142, 233, 314, Labus, J. S., 170 Loftus, E. F., 164
315 Lachnit, H., 153 Logue, A. W., 55, 266
Keele, S. W., 305 Lafond, M. V., 256 Loidolt, M., 304
Kehoe, E. J., 85, 91, 101, 171, 186, Lamarre, J., 183 LoLordo, V. M., 92, 94, 97, 98, 214,
233 Lamb, R. J., 53 215, 389, 397, 433, 434, 435
Kelleher, R. T., 256 Lamon, S., 55 Lombardi, B. R., 378
Keller, A. M., 323 Lamoureaux, R. R., 414, 415 Loo, S. K., 227
Kelley, M. J., 421 Lamoureux, J. A., 185, 228, 233 Looney, T. A., 439
Kelly, D. M., 344, 346 Landes, R. D., 270 López, F. J., 236, 238
Kennedy, P. L., 324 Lang, P. J., 186 Lopez, L. J., 161
Kenney, F. A., 169 Langley, C. M., 325 Lopez, M., 369
Kenward, B., 448 Latham, S. B., 384 Lorenz, K., 399
Kern, D. L., 374 Lattal, K. M., 93, 122, 124, 169, 177 Lovaas, O. I., 53
Kesner, R. P., 330 Laurent, V., 393 Loveland, D. H., 296, 298
Kessel, E. L., 43 Lawler, C. P., 439 Lovibond, P. F., 121, 140, 215, 233,
Kessler, D. A., 31 Lawry, J. A., 391 238, 239, 389, 390, 429, 431, 451
Kettlewell, H. B. D., 42 Lazareva, O. F., 312, 321 Lu, L., 170
Khazanchi, S., 384 Lázaro-Muñoz, G., 427 Lubow, R. E., 93
Kiedinger, R. E., 298 Lea, S. E. G., 273, 298, 304, 319 Lucas, G. A., 68, 88, 440
Kiernan, M. J., 317 Leaf, R. C., 55, 431 Lynch, J. F. III, 177
Killcross, S., 173, 432, 457 Leak, T. M., 343 Lyons, R., 58
M McGaugh, J. L., 173, 174, 177, 459, Morris, R. W., 70, 94


MacCrae, M., 101, 171 460 Morrison, S., 312
MacDonald, S. E., 344 McGehee, R. M. F., 370 Morrison, S. D., 90
Macfarlane, D. A., 249 McGeoch, J. A., 164 Moscarello, J. M., 427
Mach, J. L., 271 McGregor, A., 219, 353, 355 Moscovitch, A., 92, 435
Machado, A., 342 McHale, L., 142 Mosteller, F., 117
MacKillop, J., 272 McIntosh, D. C., 211 Motzkin, D. K., 392, 452
Mackintosh, N. J., 64, 95, 106, 129, McLaren, I. P. L., 135, 233, 239, 314, Mowrer, O. H., 380, 388, 413, 414,
130, 131, 135, 140, 233, 236, 304, 316, 352 415
305, 308, 311, 314, 315, 316, 352, McLaren, R. P., 239 Mowrer, R. R., 164
387, 394, 429, 432, 446, 451 McNally, G. P., 170 Mühlberger, A., 426
MacLennan, A. J., 433 McNally, R. J., 198 Mulatero, C. W., 451
Mactutus, C. F., 177 McNish, K. A., 149, 187 Murdaugh, D. L., 374
Madden, G. J., 270, 271 McPhee, L., 374 Murison, R., 434
Mahometa, M. J., 58 McPhee, J. E., 234 Murphy, J. G., 272
Mahon, M., 64 McPhee, L., 374 Murphy, R. A., 238, 240
Mahoney, W. J., 92, 209 McPhillips, S. A., 219 Murray, B., 298
Maidment, N. T., 390, 450 Meachum, C. L., 218 Musty, R. E., 62
Maier, S. F., 56, 64, 92, 431, 432, Meck, M. H., 343 Muszynski, N. M., 226
433, 434 Meck, W. H., 340, 341, 343 Myers, C. E., 173, 233
Majerus, M. E. N., 42 Medin, D. L., 228, 305 Myers, D. E., 370
Majeskie, M. R., 415 Meehl, P. E., 276 Myers, K. P., 214, 370
Mana, M. J., 426 Mehiel, R., 56 Myerson, J., 267, 271
Manns, J. R., 215 Mehta, R., 238 Mystkowski, J. L., 170
Manser, K. L., 142 Melchers, K. G., 153 N
Mansfield, J. G., 61, 188 Melchior, C. L., 62 Nadel, L., 163, 349
Manteigia, R. D., 352 Mensink, G.-J. M., 164, 248 Nader, K., 163, 173, 174, 176, 177
Mar, A. C., 428 Menzel, R., 225, 226 Nagrampa, J. A., 227
Marchand, A. R., 391 Meran, I., 304 Nakajima, S., 443
Marchant, N. J., 4 Mercier, P., 142, 222 Nakajima, S., 65, 122, 124
Marks, I. M., 199 Meyerowitz, B. E., 90 Napier, R. M., 101, 171
Marler, P., 226 Michael, T., 149 Nash, S., 59
Marlin, N. A., 47, 48, 68, 142 Mikulka, P. J., 219 Natelson, B. H., 273
Marsch, L. A., 270 Miller, N. E., 20, 21, 27, 277, 414, Nation, J. R., 385
Marshall, B. S., 85, 91 415 Nederkoorn, C., 275
Martins, A. P. G., 385 Miller, N. Y., 352 Nelson, J. B., 127, 166, 168, 185,
Marusewski, A., 49 Miller, R. M., 166 186, 228
Matell, M. S., 343 Miller, R. R., 47, 48, 92, 121, 127, Neuringer, A., 270, 287
Matthews, R., 350 128, 131, 142, 173, 177, 219, 233, Nevin, J. A., 263, 264
Matthews, R. N., 58 234, 238 Newby, J., 170
Matthews, T. J., 259, 451 Miller, S., 434 Newcombe, N. S., 346
Matute, H., 234, 238 Milton, A. L., 33, 177, 395 Newell, A., 22
Matzel, L. D., 92, 127, 177 Mineka, S., 16, 73, 74, 84, 87, 198, Newman, F. L., 178
Maurer, R., 343 205, 215, 222, 415, 416, 426, 429, Newman, J. L., 271
Mazmanian, D. S., 298 434, 435 Nguyen, K. P., 135
Mazur, J. E., 144, 267 Minor, T. R., 433, 434, 435 Nicholas, D. J., 451
McAllister, D. E., 414, 415 Misanin, J. R., 173, 366 Nickel, M., 27
McAllister, W. R., 414, 415 Mischel, W., 270 Nie, H., 457
McAndrew, A., 239 Mitchell, C., 317 Nisbett, R. E., 382
McCarthy, D. E., 263, 415 Mitchell, C. J., 233, 238, 239, 429 Niv, Y., 459
McClelland, J. L., 24, 143, 230, 302, Mitchell, M. R., 271 Norgren, R., 213
316 Mobbs, D., 195 North, N. C., 58, 196
McCollough, C., 189 Moffitt, T. E., 271 Nosofsky, R. M., 305
McConnell, B. L., 121 Mood, K., 312 Novick, L. R., 235
McCracken, K. M., 164 Moore, B. R., 248, 441
McCuller, T., 385 Moore, J. W., 178 O
McCully, J., 62 Moot, S. A., 335, 388 Oakeshott, S., 83, 90, 135
McCutchan, K., 350 Morgan, C. L., 11 O’Brien, C. P., 399
McDonald, R. J., 461 Morral, A. R., 53 Odling-Smee, F. J., 63
McDowell, J. J., 263, 287 Morris, R. G. M., 347, 348, 349, O’Doherty, J. P., 27, 447, 461
McFarland, D., 43, 46 351, 428 O’Donohoe, W. T., 205
Odum, A. L., 264, 271 Pietrewicz, A. T., 325 153, 171, 178, 182, 183, 184, 185,
O’Flaherty, A. S., 121 Pinel, J. P. J., 426 215, 216, 218, 235, 304, 316, 384,
Öhman, A., 73, 215, 222 Pineño, O., 171, 233 387, 388, 389, 390, 392, 393, 394,
Ohyama, T., 287 Piper, M. E., 415 416, 428, 432, 446, 447, 451, 452,
Oitzl, M., 426 Pitts, E., 219 453, 455, 456, 458
O’Keefe, J., 346, 349 Plaisted, K., 325 Revusky, S. H., 210, 370
Olmstead, M. C., 256 Plath, J. A., 226 Reynolds, G. S., 264, 265, 284
Olton, D. S., 328, 329, 330, 347 Plonsky, M., 434 Reynolds, W. F., 298
Ong, S. Y., 432 Pohorecky, L., 378 Riccio, D. C., 161, 177, 189, 311
O’Reilly, R. C., 87, 233, 317 Poling, A., 27 Richardson, R., 161
Ost, L.-G., 73 Posner, M. I., 305, 326 Richter, C. P., 364, 365, 366
Ostlund, S. B., 390, 450 Postman, L., 164, 276 Richter, J., 199
Overmier, J. B., 98, 391, 394, 397, Poulos, A. M., 27 Ricker, S. T., 167, 171
431, 432, 433, 434 Poulos, C. X., 61 Riley, A. C., 215
P Powell, R. A., 16 Riley, A. L., 422, 423, 424
Packard, M. G., 459, 460, 461 Power, A. E., 177 Riley, D. A., 330
Paletta, M. S., 190, 191, 403 Powley, T. L., 55 Rilling, M., 310
Palmer, D. C., 284, 287 Prados, J., 86, 106 Ringer, M., 218
Palmerino, C. C., 217, 218 Prelec, D., 263 Rish, P. A., 347
Paluch, R., 49 Premack, D., 277, 278, 279, 280, Ritchie, B. F., 249, 250, 444, 459
Pan, M., 222 324, 383 Rizley, R. C., 83, 216
Papas, B. C., 330 Preston, G. C., 140 Roba, L. G., 275
Papini, M. R., 103, 125 Preston, K. L., 53 Robbins, S. J., 231
Parker, L. A., 213, 214 Pryor, K., 255 Robbins, T. W., 33, 395, 404, 428
Parkinson, A. E., 328 Purdy, J. E., 27 Roberts, A., 352
Parsons, P. J., 163, 164 Q Roberts, A. D. L., 350, 351
Paschall, G. Y., 218 Qadri, M. A. J., 323 Roberts, A. E., 391
Patenall, V. R. A., 199 Quinn, J. J., 153 Roberts, S., 337, 341
Patterson, A. E., 64 Roberts, W. A., 172, 295, 298, 327,
R 328, 347
Patterson, J. M., 161 Rabinowitz, V. C., 161
Pauli, P., 426 Robertson, D., 219
Rachlin, H., 256, 263, 266, 268, 269, Robinson, T. E., 89
Pavlov, I. P., 106, 150 274
Payne, D., 88 Robinson, J. L., 49
Raiijmakers, J. G., 164, 248 Robinson, T. E., 33, 89, 192, 395,
Pearce, J. M., 88, 100, 127, 132, 133, Ramirez, K., 255
134, 135, 140, 142, 150, 151, 152, 402, 404
Ramos, B. M. C., 62, 184 Roby, T. B., 277
186, 219, 295, 304, 305, 308, 350, Ramsay, D. S., 193, 372
351, 352, 353, 354, 355, 384, 451 Rodefer, J. S., 48
Ramsey, M., 58 Rodgers, W. L., 375
Pearson, D., 135 Randich, A., 94
Peck, C. A., 65, 172 Rodrigo, T., 352
Rankin, C. H., 48 Rodriguez, G., 172
Pecoraro, N. C., 378 Rapaport, P., 92
Peissig, J. J., 306, 319, 323 Rodriguez, M., 270
Ratcliffe, L., 311 Roelofs, K., 426
Pellón, R., 440 Ratner, A. M., 399, 400
Pennypacker, H. S., 308 Roemmich, J. N., 48, 49
Rawlins, J. N., 349 Roitblat, H. L., 295, 328, 331
Perdeck, A. C., 46 Raymond, G. A., 70
Perin, C. T., 22, 365 Roll, D. L., 68
Rayner, R., 15, 86 Rolls, B. J., 447
Perkins, C. C., Jr., 161 Reberg, D., 64, 99, 101
Perlow, S., 268 Roper, T. J., 273
Redford, J. S., 358 Rosas, J. M., 98, 166, 209, 338, 339
Perruchet, P., 239, 240 Redhead, E. S., 151, 352
Perry, J. L., 271 Rose, S. P. R., 177
Redish, A. D., 459 Rosellini, R. A., 432, 434, 435
Perusini, J. N., 195 Reeks, L. M., 122
Peterson, G., 63, 391 Rosen, J. C., 415
Reid, P. J., 325 Rosengard, C., 169
Peterson, G. B., 394 Reiss, S., 100
Peterson, J., 415 Ross, R. T., 178, 181, 182
Remington, B., 142 Roussel, J., 381
Petrovich, G. D., 374 Remington, G., 433
Petry, N. M., 270, 272 Rovee-Collier, C., 32, 33
Renda, C. R., 270 Rozin, P., 72, 90, 208, 211, 375
Pfeffer, A. O., 439 Renner, K. E., 206
Pharr, V. L., 58 Rudolph, R. L., 307
Reppucci, C. J., 374 Rudy, J. W., 87, 93, 233, 317
Philput, C., 219 Rescorla, R. A., 28, 35, 64, 80, 82,
Pierce, P., 459 Rumelhart, D. E., 23, 24, 143, 230,
83, 87, 90, 98, 100, 101, 106, 114, 232, 302, 316
Pierce, W. D., 383 120, 121, 122, 124, 125, 126, 128, Rünger, D., 457
Rusiniak, K. W., 213, 216, 217, 218, Sherman, J. E., 433 Squibb, R. L., 367
219 Sherman, L., 268 Squire, L. R., 215, 221, 332
Russell, J., 448 Sherry, D. F., 221 Srinivasan, M. V., 226
Rust, T. B., 344 Shettleworth, S. J., 72, 73, 222, 325, Staddon, J. E. R., 54, 256, 282, 283,
Ryan, C. M. E., 298 337, 341, 344, 350, 352, 366, 367, 284, 285, 342, 343, 438, 439, 440
Ryan, R. M., 382 461 Stampfl, T. G., 66
S Shiffrin, R. M., 23, 35, 133, 136, 248 Stanton,C. M., 270
Sagness, K. E., 234 Shimizu, T., 321 Stark, K., 164
St. Claire-Smith, R., 451 Shimp, C. P., 263 Starr, M. D., 401, 402, 403, 416, 429
Sakamoto, J., 4 Shimura, T., 217 Stein, J. S., 270
Saksida, L. M., 314, 335, 336 Shoba, B. C., 374 Steinberg, L., 374
Saladin, M. E., 149 Shoda, Y., 270 Steinmetz, A. B., 27, 86
Salkovkis, P. M., 428 Shogren, R. E., 192 Steinmetz, J. E., 27, 86
Samson, H. H., 439 Shull, R. L., 264 Steirn, J. N., 318, 331
Samuelson, R. J., 328 Siddle, D. A., 142 Sterio, D., 352
Sandoz, J.-C., 226 Sidman, M., 416 Stetson, L., 139
Sanjuan, M. C., 228 Siegel, S., 33, 60, 61, 62, 93, 184, Stevenson-Hinde, J., 72
Sansa, J., 86, 352 188, 189, 192, 193, 199, 382, 403 Steward, O., 177
Santamaria, J., 352 Silva, K. M., 193, 194 Stewart, J., 33, 191, 192, 395, 404
Sargisson, R. J., 327 Silverman, K., 271 Stilman, A. J., 434
Saunders, B. T., 33, 89, 395 Silverman, P. J., 259, 451 Stins, J., 426
Savage, L. M., 432 Silverstein, S. M., 53 Stokes, L. W., 336, 418
Savastano, H. I., 127 Simmelhag, V. L., 54, 284, 285, 438, Stokes, P. D., 287
Sawabini, F. L., 206 440 Stolerman, I. P., 33
Scalera, G., 90 Simon, H. A., 22, 23 Stout, S. C., 127
Schachtman, T. R., 127, 166 Singer, D., 400 Strasser, H. C., 142
Schackleton, S., 311 Singer, R. A., 335 Straub, J. J., 83, 215, 452
Schacter, D. L., 221 Skinner, B. F., 17, 35, 54, 87, 178, Strubbe, J. H., 55, 372
Schafe, G. E., 173 253, 257, 260, 272, 284, 285, 437, Stunkard, A. J., 55
Schaffer, M. M., 305 439, 454 Stuttard, S., 248
Schell, A. M., 215 Skinner, D. M., 350 Suarez, S. D., 59
Schepers, S. T., 442 Sletten, I. W., 193 Sullivan, S., 374
Schleidt, W. M., 47 Slotnick, B. M., 219 Sunsay, C., 139, 145
Schmajuk, N. A., 185, 233 Smith, A. E., 334 Sunstein, C. R., 272
Schneider, W., 35, 133 Smith, B. H., 227 Sutherland, R. J., 349, 350
Schoenfield, W. N., 416 Smith, G. J., 162, 163 Sutton, R. S., 229, 458
Schroeder, G. R., 358, 359 Smith, J. C., 68 Suzuki, S., 177, 329, 344
Schull, J., 403 Smith, J. D., 358 Swartzentruber, D., 140, 160, 168,
Schultz, W., 27 Smith, M. C., 91, 92 169, 170, 184, 185, 219
Sclafani, A., 56, 214, 374 Smith, N. F., 366 Symonds, M., 172
Scobie, S. R., 390 Smith, S. M., 165 T
Seaman, S. F., 401, 403 Smithson, C., 16 Tait, R. W., 149
Seeley, R. J., 372, 373 Snijders, S. J. B. D., 170 Tanaka, S., 65, 443
Selekman, W., 160 Snyder, C. R. R., 326 Tangen, J. M., 238, 239
Seligman, M. E. P., 73, 90, 220, 429, Soeter, M., 174 Tarpy, R. M., 206
431, 432, 433, 434 Solomon, R. L., 92, 388, 389, 390, Taylor, T. L., 433, 434, 435
Sengun, S., 199 396, 397, 398, 401, 416, 432, 446, Teasdale, J., 433
Sevenster, D., 176, 239 453 Teki, S., 343
Shackelford, T. K., 220 Soltysik, S. S., 121 Temple, J. L., 48, 49
Shaham, Y., 4, 170 Soteres, B. J., 349 Templer, V. L., 358, 359
Shahan, T. A., 257 Sovrano, V. A., 346 Terman, G. W., 433
Shamosh, N. A., 270 Sowell, M. K., 218 Ternes, J. W., 399
Shanks, D. R., 153, 215, 222, 223, Speakman, A., 346 Terrace, H. S., 88, 254
228, 229, 230, 231, 233, 234, 236, Spear, N. E., 162, 163, 164, 166 Terry, W. S., 138, 140, 145
238 Spelke, E. S., 346 Testa, T. J., 434
Shavalia, D. A., 330 Spence, K. W., 256, 309, 379 Thaler, R. H., 272
Sheffer, J. D., 56 Spencer, W. A., 48 Thewissen, R., 170
Sheffield, F. D., 277, 366, 441 Spetch, M. L., 306, 311, 323, 344, Thomas, D. A., 161
Sheffield, V. F., 384 345, 346, 352 Thomas, D. R., 161, 311, 312
Shepp, B. E., 135 Spivey, J. E., 386 Thomas, G. V., 211, 219
Sherburne, L. M., 318 Spreat, S. R., 3 Thomas, J. M., 374
Thompson, R. F., 27, 47, 48, 50, von Fersen, L., 304 Wilkie, D. M., 335, 336
86, 95 Vrba, E. S., 220 Wilkinson, D. A., 61
Thorndike, E. L., 13 Vreven, D., 326 Williams, B. A., 255, 256, 258, 263,
Thow, A. M., 275 Vurbic, D., 153, 167, 387, 443 287, 451
Thrailkill, E. A., 256, 457 W Williams, D. A., 98, 231, 234
Tiffany, S. T., 184, 395, 404 Wages, C., 349 Williams, D. R., 388, 441
Timberlake, W., 188, 193, 194, 195, Wagner, A. R., 25, 83, 92, 93, 100, Williams, G. C., 42
197, 281, 282, 367, 378, 437, 440 101, 106, 107, 108, 114, 122, 126, Williams, H., 441
Tinbergen, L., 325 137, 144, 148, 149, 150, 151, 152, Williams, J., 58
Tinbergen, N., 43, 44, 46, 194 153, 186, 187, 190, 191, 199, 222, Williams, J. L., 432
Tinklepaugh, O. L., 376, 378, 407 223, 304, 381, 382, 385, 391, 393, Williams, R. J., 232
Tinsley, M., 378 403, 404 Williams, S. B., 22, 365
Tiunova, A. A., 177 Wahl, G., 195, 437 Wills, A. J., 135
Todd, T. P., 166, 443 Wakefield, J. C., 220 Wilson, E. O., 42
Toledano, D., 177 Wakita, M., 4 Wilson, G. T., 55
Tolman, E. C., 20, 26, 194, 249, 250, Waldmann, M. R., 237, 238 Wilson, J. F., 440
251, 364, 370, 376, 407, 444 Walk, R. D., 314 Wilson, N. E., 9, 71
Tomarken, A. J., 74 Walz, N., 426 Wilson, P. N., 134
Tomie, A., 127, 434, 442 Wang, S.-H., 176, 177 Winger, G., 439
Tooby, J., 221 Ward, J. S., 381 Winterbauer, N. E., 443
Torquato, R. D., 264 Warden, C. J., 364 Wisniewski, L., 48
Tota, M. E., 264 Ward-Robinson, J., 352 Witcher, E. S., 128
Trapold, M. A., 391, 394 Warren, D. A., 434, 435 Witnauer, J. E., 127
Trask, S., 456 Washburn, D. A., 358 Wixted, J. T., 328
Trattner, J., 281 Wasserman, E. A., 63, 68, 222, 235, Wolpe, J., 66, 171
Trauner, M. A., 435 236, 238, 295, 298, 299, 300, 301, Wong, C. J., 53
Treanor, M., 66 302, 303, 306, 312, 319, 321, 323, Wong, P. T., 385
Treisman, A., 323 324, 325, 355 Wood, N. E., 176
Tricomi, E., 447, 458 Wassum, K. M., 450 Wood, W., 457
Triola, S. M., 452, 458 Watanabe, S., 4, 298 Woodbury, C. B., 186
Trost, C. A., 218 Watkins, L. R., 432, 434 Woods, A. M., 171
Trumble, D., 416 Watson, J. B., 15, 16, 86 Woods, S. C., 49, 55, 189, 192, 193,
Tryon, R. C., 46 Watson, P., 393, 394 372
Tulving, E., 221, 332 Watt, A., 457 Wray, J. M., 395
Turner, C., 129, 130, 131 Wearden, J. H., 342 Wright, A. A., 324, 325
Turner, L. H., 389, 397 Weary, D. M., 311 Wrightson, J., 378
U Weaver, M. S., 172 Wulff, J. J., 277
Ulmen, A., 177 Webster, M. M., 90 Wyvell, C. L., 404
Umbricht, A., 53 Weidemann, G., 85, 239, 429 Y
Underwood, B. J., 164 Weingarten, H. P., 374, 376 Yamamoto, T., 217
Urcelay, G. P., 127, 219, 428 Weinstock, S., 384 Yi, R., 270
Urcuioli, P. J., 318, 319, 320, 331 Weisend, M. P., 350 Yohalem, R., 259, 451
Urda, M., 311 Weisman, R. G., 311, 323, 428 Yoshihara, M., 304
Urushihara, K., 65, 234, 443 Weiss, J. M., 433, 434 Young, A. M., 60
V Weiss, S. J., 311 Young, M. E., 306, 312, 323, 324,
Valentin, W., 447 Weissman, R. D., 311 325
Vallee-Tourangeau, F., 222, 240 Werz, M. A., 329 Younger, M. S., 418
Vallortigara, G., 346 Westbrook, F., 219 Z
van den Hout, M., 170 Westbrook, R. F., 87, 122, 153, 172, Zach, R., 29, 30, 51
Vandergriff, D. H., 59 317 Zahorik, D. M., 56, 64
Van Hamme, L. J., 236, 321 Weyant, R. G., 161 Zbozinek, T., 66
van Houten, R., 307 Wheatley, K. L., 92 Zellner, D. A., 215
Vaughan, W., 263, 301, 305 White, K. G., 327, 328, 343 Zentall, T. R., 295, 318, 319, 320,
Vegas, R., 83 White, N. M., 461 327, 331, 334, 335, 355
Vela, E., 165 Whitehead, A. N., 457 Zhang, S., 226
Vervliet, B., 66, 122, 170, 171, 174, Whiting, M. R., 219 Zhou, W., 334, 335
205 Whitlow, J. W., 93, 141 Zimmer-Hart, C. L., 128
Vila, J. C., 62 Wiens, S., 215 Zinbarg, R., 16, 205, 415
Vogel, E. H., 152, 153 Wiers, R. W., 393 Zubek, J. P., 46
Von Culin, J. E., 347 Wiertelak, E., 312
Wilcoxon, H. C., 72
Subject Index
Page numbers in italics indicate affective dynamics, standard pat- of space, 343–355
figures; those followed by t tern of, 397–398 of time, 335–343
indicate tables. affective extension of SOP model animal learning, 4, 12–14, 28. See
A (AESOP), 148–149, 176, 187, also specific animals
A1 state, 144–148, 190, 236 391 in artificial selection
A2 state, 144–148, 190, 236 after-image, 397 experiments, 46
accumulator, in information pro- after-reaction, 397, 404 in classical conditioning, 28–29,
cessing model, 340, 343 aggression, and territoriality, 55–64, 80–90
acquired drive experiment, 414 56–57, 58 compared to human learning,
acquired equivalence, 317–320 agoraphobia, 198, 415, 428 25–27, 234–235
acquired motivation, 376, 399 Ainslie, George, 268 in deprivation experiments, 46
acquisition, 65 Ainslie-Rachlin rule, 268–269 discrimination in, 151, 179, 180,
adaptation, 41, 42–44 Albert B. (Little Albert), 15–16, 86 181–182
animal learning in, 26, 28 alcohol use foraging behavior in, 29–31, 34,
in classical conditioning, 28, body temperature effect in, 188 51, 54, 72
54–64 conditional stimulus in, 442 generality of, 205–208, 227, 241
conditional response in, 190 extinction and renewal effect as incremental process, 11
and exaptation, 220–221 in, 170 in instrumental conditioning,
and extinction, 65 substitutability concept in, 18, 29, 51
and habituation, 50 274–275 memory in, 25, 26, 27, 137, 138,
in instrumental conditioning, taste aversion learning in, 55, 160–161, 164–165
50–54 73, 209 metacognition in, 355–359
learning mechanisms in, 208, tolerance in, 61 Morgan on, 10–11
220, 221 almond extract in conditioning in operational behaviorism,
addiction experiments 20–21
conditioning in, 62–63 and backward blocking, in positive and negative
motivation in, 395, 398–399, 236–237 patterning, 151
401–405 and compound potentiation, in puzzle box experiments,
occasion setters in, 184 216–217 12–13, 246, 248
opponent-process theory of, ambiguity, 171, 172t regulations in research on, 27
398–399, 401–404 of conditional stimulus, 168 relative validity effect in, 222,
withdrawal in, 404 amphetamines, 61 223
adjunctive behaviors, 438, 439–440 Amsel, Abram, 380 Romanes study of, 10
adrenal glands, 364 analgesia in Skinner box experiments, 17,
AESOP (affective extension of endorphins in, 425 18, 19, 28, 245, 252, 253
SOP) model, 148–149, 176, in inescapable shock, 433 spatial behavior in, 343–355
187, 391 analogous traits, 221 of taste aversion, 8–9
affect, 148. See also emotions animal cognition, 295 time cognition in, 335–343
metacognition in, 355–359
anisomycin, and memory recon- atropine, 192–193 Balleine, Bernard, 368
solidation, 174 attachment, 399–401 Balsam, Peter, 93
antecedent stimulus, 19 in imprinting, 399–401, 402 barbiturates, 61
anticipation, 376–396 attention Barger, Albert (Little Albert),
and frustration, 380–382 in conditioning, 130–136 15–16, 86
antipredator behaviors, 43–44, 195 and generalization gradient, 308 Barlow, David, 198
anxiety disorders, 170, 176 hybrid models on, 135, 136 beacons in spatial behavior,
avoidance behavior in, 415, 427 in inescapable shock, 434 343–344, 350–353
fear conditioning in, 198 in information processing, beak color preferences of zebra
panic disorder in, 198, 199 325–326 finches, 311
safety behaviors in, 428 Mackintosh model on, 130–132, behavioral economics, 272–276
appetitive learning, 436–445 308 demand curves in, 273–274
conditional stimulus in, 89–90, priming of, 326 food choices in, 275–276
395–396 attributions, in learned helpless- independents and complements
overeating in, 31 ness, 433 in, 274–275
Pavlovian-instrumental transfer autism, shaping behavior in, 53 substitutability of reinforcers
in, 389–390, 392–393, 394 automaintenance, negative, 441 in, 272–275
revaluation of US in, 215, 216 automatized behavior, 35 behavioral regulation theory,
vigor of response in, 388 autoshaping 282–284
approach response, 63, 64 and negative automaintenance, behavioral theory of timing,
in sexual behavior system, 196, 441 342–343
197 in pigeons, 68, 87–89, 182–183, behavior chain, 255–256, 441
a-process, 398, 399 222, 440–441 behaviorism
artificial selection, 46 aversions operational, 20–22
association, 7–8 conditional stimulus in, 395–396 of Skinner, 17–19
categorization by, 320 counterconditioning in, 172t of Watson, 15–16
in classical conditioning, 79, 80, taste. See taste aversion learning Behaviorism (Watson), 16
81–82, 83 avoidance behavior, 411–436 The Behavior of Organisms (Skin-
generalization in, 316 cognitive factors in, 426–431 ner), 437
inhibitory, 316 expectancies in, 429, 430 behavior systems, 193–199
laws of, 8 freezing reaction in, 421–424, 427 in feeding, 193, 194–195
in perceptual learning, 316 and law of effect, 51 in panic disorder, 198–199
in sensory preconditioning, 84 learned helplessness in, in predation, 194–195
within-compound, 218–219 431–435 in sexual behavior, 193, 195–197
Associationists, 8, 241 negative occasion setters in, 430 Bernstein, Ilene, 90, 96
associative learning, 225–241 Pavlovian-instrumental transfer Beyond Freedom and Dignity (Skin-
blocking effect in, 237–238 in, 390 ner), 272
in classical conditioning, 79, 80, Pavlovian learning in, 424, bias, inborn, 9
81–82, 83 426–431 bidirectional response systems, 100
dual-process view of, 239 reinforcement of, 52 biomorphs, 285–287
in honeybees, 223–226 in shuttle box experiments, birds. See also specific types of birds
in humans, 228–241 389, 412–416, 418, 419, 421, autoshaping in, 68, 87–89,
inferential reasoning in, 233 427–428, 431 182–183, 222, 440–441
proposition learning in, 239 in Sidman avoidance, 416–417 eggshell removal by, 43, 44
taste aversion in, 220 species-specific defense expert discrimination between
associative strength reactions in, 420–426, 427, 428 types of, 313
of absent cue, 236–237 thigmotaxis in, 421 foraging behavior in, 29–31, 34,
attention affecting, 130 two-factor theory of, 412–420 51, 54
in backward blocking, 236 vigor of, 388 habituation in, 47
of configural cues, 186 warning signals in, 413, 415– imprinting in, 399–401, 402
in external inhibition, 150 420, 428, 429 instrumental conditioning and
Mackintosh model on, 130, 131 awareness, conditional response innate behavior in, 46
in occasion setting, 185 in, 215 mobbing behavior in, 44
in partial reinforcement, B natural selection of, 42
384–385 background cues, 125–126, 160 pecking behavior in. See
Rescorla-Wagner model on, backward blocking, 234–237 pecking behavior
115–124, 166 backward conditioning, 91, 92, 98, reproductive behavior in,
assumptions, Kant on, 8, 241 146, 147–148 58–59, 195–197
asymptote of learning curve, 115, sensory and emotive US nodes Bitterman, Jeff, 226
116, 119, 122, 123 in, 149 black-headed gulls, eggshell
atomistic approach, 7 bait and switch studies, 376–378 removal by, 43
The Blind Watchmaker (Dawkins), and discrimination, 295–305 cigarette smoking. See smoking
285 exemplar theory of, 305 circadian rhythm, 335–336
bliss point, 282–283 feature theory of, 302–304 circular reinforcers, 276
minimum distance model on, mediated generalization in, classical conditioning, 13, 25, 28,
282–284 319, 319t 79–111
blocking effect, 103–106, 114, network models of, 231 adaptation in, 28, 54–64
117–118 by pigeons, 4, 295–305 of appetite, 89–90
in associative learning, 237–238 prototype theory of, 304–305 autoshaping in, 87–89
backward, 234–237 transfer tests of, 300, 301 of avoidance behavior, 426–431
forward, 234, 235, 236, 238 cathexes, in motivated behavior, 370 blocking and unblocking in,
in honeybees, 227 cats, in puzzle box experiments, 103–106
in humans, 228, 234–237 12–13, 246, 248 compared to instrumental
Mackintosh model on, 131, 134 caudate nucleus, 459–460 conditioning, 28–29, 41, 64–74
Pearce-Hall model on, 134 causal learning, 228–237 conditioned compensatory
probabilistic contrast model on, causal power, 237, 238, 241–242 response in, 60–63
235–236 causes, 8 conditioned inhibition in,
Rescorla-Wagner model on, competition between, 237–238 97–101
117–118, 129–130, 131, 134, and effects, 237–241 conditioned suppression in,
235 learning about, 228–237 86–87
retrieval-generated priming chaffinches, 47, 72 CS-US contingencies in,
in, 138 chained schedule of reinforcement, 101–103
in spatial tasks, 350–352 260 of drug use, 60–63
blue gouramis, territoriality and chemotherapy, taste aversion extinction in, 64–66
reproduction of, 56–58 learning in, 90, 96, 106 of eyeblink, 85–86
blue jays, 42 Cheng, Ken, 344–345 of fear, 59–60, 68, 82–83, 86–87
pecking response to digital Cheng, Patricia, 235 frustration in, 381
moths, 293–295, 325 chickens, misbehavior of, 437 generalization of response in,
bobwhite quail, feeding behavior children 81
of, 72 fear conditioning of, 15–16, 86 innate behavior in, 46
body temperature, compensatory food intake of, 373–374 intensity of stimulus in, 94
responses in, 188, 189 learning and memory of learning in, 81–83, 159, 160
b-process, 398, 399, 401, 402–403 infants, 31–33 memory of, 160, 332
brain play of, 31, 278–279 methods for study of, 84–90
cognitive modules in, 222 Premack principle on behavior motivating effects of rewards
computer metaphors of, 22–25 of, 278–279, 281 in, 380, 388
connectionist view of, 24–25 reinforcer devaluation effect in, novelty of stimulus in, 93–94
geometric module in, 346, 353 447–449 overshadowing in, 106
in habit learning, 459–461 salivary response habituation in Pavlov experiments, 55,
of honeybee, 225 in, 48–49 80–81
neuroimaging of, 27 taste aversion learning of, 90, 96 preparedness in, 70–72
neurotransmitter levels in, 433 chimpanzees, relational learning pseudoconditioning in, 95–97
place cells in, 346–347 by, 312 relative validity in, 106–109
in unconditional response, choice, 260–276 of reproductive behavior, 56–59
191–192 in behavioral economics, secondary preconditioning in,
Brandon, Susan, 148, 150, 151 272–276 83, 84
Breland, Keller, 72, 436–437 in concurrent schedule of second-order (higher-order),
Breland, Marian, 72, 436–437 reinforcement, 260–261 83–84
bright-noisy water, 71 contingency management and sensitization in, 95–97
British Empiricists, 7–8 incentive-based approaches sign tracking in, 63–64
Broadbent, Hilary, 341 to, 270, 271–272 size of outcome in, 69, 70
bulimia nervosa, 415 for healthier foods, 272, 275–276 stimulus-outcome learning in,
C matching law on, 261–263, 264 29, 34
cached food, episodic-like memory microchoices in, 347 strength of conditioning in, 87,
of, 332–333, 334 precommitment strategies in, 90–97
calories, flavors associated with, 269 of taste aversion, 68, 70–72, 90
56, 63, 64, 214 Premack principle on, 277–282 timing of outcome in, 67–68, 69
Capaldi, John, 385 and quantitative law of effect, timing of stimulus in, 91–92
catastrophic interference, 231 264–266 timing of trials in, 93
categorization, 228 in self-control and Clayton, Nicky, 332
by association, 320 impulsiveness, 266–271 clock, internal, 340–342, 343
and causal learning, 228–237 Church, Russell, 340, 341 cocaine use, 4, 170, 271, 395
cocoon-building by spiders, 44–45 compensatory, 188–193 and pseudoconditioning, 95–97
cognition in drug conditioning, 193 in relative validity experiments,
in animals, 295, 335–359 electrodermal, 215 107
and avoidance behavior, in eyeblink conditioning, 85, 187 relevance of, 210
426–431 in panic disorder, 198, 199 in retardation-of-acquisition
complex, 26 in second-order conditioning, 83 test, 99, 100
connectionist view of, 24–25 in sensory preconditioning, 84 salience of, 94, 116, 442
deficits in, 431, 434 in sexual behavior, 196, 197 in second-order conditioning,
and instrumental action, in taste aversion learning, 83, 84
445–461 214–215 sensitization to, 95–97
and metacognition, 355–359 conditional stimulus (CS), 80–81, in sensory preconditioning, 84
of space, 343–355 82, 101–109 in sexual behavior system, 196,
standard model of, 23 activation of US node, 168 197
of time, 335–343 AESOP model on, 148–149 SOP model on, 144–148,
cognitive map, 249, 343, 349 ambiguous, 168 151–152
cognitive psychology, 22, 25, 26 in appetitive conditioning, in stimulus-outcome learning,
cold thoughts, 270 89–90, 395–396 452–454
cold tolerance, 189 attention to, 130–136 in stimulus-response-outcome
color vision in autoshaping, 88–89 learning, 454–455
after-images in, 397 backward, 435 surprise of, 138–139, 140, 142
McCollough effect in, 189 in blocking and unblocking, in taste aversion learning,
comparative psychologists, 10–12, 104–105, 117–118, 129–130 90, 206–207, 210, 211–212,
14, 15 compound. See compound 214–216
comparator, in information pro- conditional stimuli time of presentation and
cessing model, 340 and conditioned inhibition, strength of conditioning,
comparator theories, 127 97–101 91–93, 147
compensatory response, 188–193 configural theory of, 150–151, and unconditional stimulus
conditioned, 60–63, 190–193, 403 186 association, 115, 160, 452
in drug tolerance and context modulating response and unconditional stimulus
dependence, 402–403 to, 178–187 contingencies, 101–103,
unconditioned, 191 context of, 125–126, 140 125–127
competition in explicitly unpaired and unconditional stimulus
between causes, 237–238 procedure, 98 interval, 197
between effects, 238 in external inhibition, 150 and unconditional stimulus
between spatial cues, 350–352 extinction of, 168 pairings, 102–105, 115, 116
complements, in behavioral eco- in eyeblink conditioning, 85, conditioned compensatory re-
nomics, 274–275 138, 148–149, 152, 187 sponse, 60–63
complex behavior and cognitions, in fear, 86–87, 160, 396 conditioned emotional response
26 inhibiting activation of US technique, 86
compound conditional stimuli, 120, node, 168 conditioned excitation, 97
121, 122, 123, 127–128, 217 in inhibition of delay, 98 conditioned inhibition, 97–101
configural theory of, 151, 186 intensity of, 94 in backward conditioning, 148
definition of, 98 meaning of, 168 detection of, 98–100
effect of adding and removing memory of, 25, 137–138, in honeybees, 227
elements in, 152–153 160–162 in humans, 228
predicting US, 117 motivational effects of, 391–394 negative contingency in, 126
compound potentiation, 216–220 node activation, 144–148 Rescorla-Wagner model on,
compound schedule of reinforce- novelty of, 93–94 119–120, 128
ment, 260 as occasion setter, 178–186 retardation-of-acquisition test
compulsive behavior, 442 outcome associated with, of, 99–100
computer metaphors on brain, 452–454 summation test of, 99
22–25 overshadowing by, 106 in taste aversion learning, 212
concurrent measurement studies, in panic disorder, 198, 199 conditioned reflex, 14
388 in Pavlovian-instrumental conditioned reinforcement,
concurrent schedule of reinforce- transfer, 389–391 254–257
ment, 260–261 predicting unconditional of behavior chain, 255–256
matching law on, 261–263 stimulus, 102–103, 114–115, conditioned sensitization, 192, 193
conditional discrimination, 318 116, 117, 124–125, 130, 131 conditioned suppression, 86–87,
conditional response (CR), 81, 82, preexposure to, 93–94, 100–101, 91, 103, 118, 124, 161, 339,
187–200 140, 142, 172, 173, 211–212 415, 416
in appetitive conditioning, 89 priming of, 138–141 conditioning, 12–14, 25
attention in, 130–136 of extinction, 167, 168, 169, 170, Cupiennius salei, cocoon-building
blocking effect in, 103–106, 114, 177–178 behavior in, 44–45
117–118. See also blocking in habituation, 142 D
effect internal state as, 169 Darwin, Charles, 9–10, 285
classical, 28, 79–111. See also in latent inhibition, 140, Dawkins, Richard, 42, 285–287
classical conditioning 172–173 dead reckoning, 343
compound stimuli in. See and memory, 165–166, 167–179 declarative memory, 332
compound conditional modulating response to CS, deer mice, 421, 422
stimuli 178–187 defense reactions
generality of laws on, 205–243 occasion setters in, 184 antipredator behaviors in,
in honeybees, 225–228 temporal, 169 43–44, 195
information value in, 101–109 contextual stimuli, 125–126 freezing in, 421–424, 428
innate behaviors in, 46 contiguity between ideas, 8 recuperative behaviors in,
instrumental, 28, 79–80. See also contiguity theory, 247–249 425–426
instrumental conditioning contingency, 103, 125–127 species-specific, 420–426, 427,
learning in, 81–83 negative, 103, 125, 126, 235, 441 428
Mackintosh model on, 130–132 positive, 103, 235 thigmotaxis in, 421, 426
operant, 18–19, 26, 28, 29 for preferred behavior, 281 deflation experiments, 215–216
Pavlovian. See Pavlovian Premack principle on, 277–278, de la Mettrie, Julien, 6
conditioning 281 delay conditioning, 91, 146
Pearce-Hall model of, 132–134, probabilistic contrast model delay discounting, 267–268,
135, 136, 150, 168 on, 235 270–271
and pseudoconditioning, 95–97 zero, 125, 126, 127, 432 delayed matching-to-sample pro-
Rescorla-Wagner model of, contingency management, 270, cedure (DMTS), 318, 327–328,
114–130. See also Rescorla- 271–272 331–332
Wagner model continuous reinforcement, 257, 384 metacognition in, 356–358
second-order (higher-order), contrast effects, 227, 377–378 delayed rewards, 211, 255, 262–263
83–84, 216 controllability, and stress in ines- Ainslie-Rachlin rule on,
and sensory preconditioning, capable shocks, 434–435 268–269
83, 84 Cook, Robert, 323 discounting of value in, 267–
strength of, 87, 90–97 Corbit, John, 396, 397 268, 270–271
theories of, 113–157 correlation, negative, 98 self-control and impulsiveness
in Watson experiments, 15–16 corticosterone, and negative con- in, 266–271
conditioning preparations, 84 trast effects, 378 delta rule, 229
configural conditioning, 186 Coturnix coturnix japonica, 58–59, generalized, 232
configural cues, 186, 232–233 195–197 demand characteristics, 26
configural nodes, 150–153 counterconditioning, 171–172, 172t demand curves, 273–274
configural theory, 150–153 courtship behaviors, 43, 57–58 Dempes disorder, as fictitious
connectionism, 24–25, 143, 229–233 Couvillon, Pat, 226 disease, 223, 228, 230
pattern completion in, 316 CR. See conditional response depletion-repletion theory, 371–376
connections, 229–233 craving, 395 depression, in learned helpless-
in category and causal learning, creativity, rewards affecting, 382, ness, 433
229–231 384 deprivation experiments, 46
in networks, 229–233 crows, foraging behavior of, 29–31, behavior patterns in, 366–367
of nodes, 24–25, 143, 229–230 34, 51, 54 Drive in, 366–367
consequences, 19 CS. See conditional stimulus incentive learning in, 368–369
selection by, 284–288 cues lever pressing in, 367, 368–369
consolidation of memory, 173 absent, associative strength of, motivation in, 364–365, 366
and reconsolidation, 173–177 236–237 Premack principle on preferred
constructs, theoretical, 20–22, 26 in classical conditioning, 80, 81, behavior in, 281–282
consummatory behavior, 291 108–109 response deprivation
in feeding, 194, 195 compounded, in drug use, hypothesis on, 282
in panic disorder, 198 108–109 Descartes, René, 5–6, 9
sexual, 195–196 configural, 186 The Descent of Man and Selection in
context interoceptive, 199 Relation to Sex (Darwin), 10
background cues in, 125–126 occasion setters as, 178 desensitization, systematic, 171
of conditional stimulus, 125– in panic disorder, 199 Dickinson, Anthony, 332, 368
126, 140 in spatial behavior, 343–346, diets
and CS-US contingencies, 349, 350–353 healthy, 375
125–127 cumulative recording, 257–258 high-protein, 371–372
thiamine-deficient, 375, 375 as energizing, 366–367 taste aversions in. See taste
differential inhibition, 98 Hull on, 22, 276–277 aversion learning
digestion and motivation, 276–277, 365– temporal cues in, 374
and compensatory responses to 367, 371, 379 economics, behavioral, 272–276
food, 189 needs in, 366 effects, 228
reflexes of, 13–14 Drive reduction theory, 276–277 and causes, 237–241
discrimination drug use, 33, 34 competition between, 238
and categorization, 295–305 addiction in. See addiction law of, 13, 51–52, 64, 246–247,
conditional, 318 classical conditioning in, 60–63 264–266
exemplar theory of, 305 compensatory response in, eggshell removal, 43, 44
by experts, 313 60–63, 188–189, 191, 192–193, elastic commodities, 273
in fading procedure, 254 402–403 electrodermal response, 215, 239,
feature-negative, 178, 179, 180, compound cues in, 108–109 429
181, 183, 185, 309 conditional stimulus in, 442 elemental nodes, 150–153
feature-positive, 178, 179–180, conditioned sensitization in, elemental theories, 150–153
181, 183, 185 192, 193 elicited response, 19
feature theory of, 302–304 context of internal state in, 169 emotional conditioning, AESOP
and generalization, 305–320 discounting delayed rewards model on, 149, 176
negative patterning, 151, 233 in, 270–271 emotional nodes of unconditional
occasion setting in, 178–181 habituation in, 403–404 stimulus, 148–149
perceptual learning in, 313–317 impulsiveness in, 270–271 emotions
positive patterning, 151, 233 memory in, 173, 174–176, 177, after-reaction in, 397
in preexposure, 315–316 270 corticosterone levels in, 378
prototype theory of, 304–305 motivation in, 395, 398–399, deficits in, 431
same/different judgments in, 401–405 frustration in, 381, 387
323–325 occasion setters in, 184 memory of, 148–149, 162, 176
serial, 182, 184–185 opponent-process theory of, in negative contrast, 378
simultaneous, 182, 184, 185 398–399, 401–404 in social attachment, 399–401
in stimulus control, 253, 254, 293 Pavlovian-instrumental transfer and standard pattern of
visual perception in, 321–325 in, 390, 394–395 affective dynamics, 397–398
discriminative inhibition, 98 reinforcement of, 33, 265–266, and unconditional stimulus
discriminative stimulus, 254–255, 274–275 nodes, 148–149
293 substitutability concept in, empidid flies, courtship behavior
in behavior chain, 256 274–275 of, 43
in stimulus-outcome learning, tolerance in, 60–62, 193, empiricism, 7–8, 242
452–453, 454 402–403 endorphins, 425, 426
in stimulus-response-outcome as unconditioned stimulus, in fear conditioning, 59–60
learning, 455–456 188–189, 193 entrainment, 336
dishabituation, 48 dual-process view entropy, 324
DMTS (delayed matching-to-sam- of associative learning, 239 environment
ple procedure), 318, 327–328, of habituation, 50 adaptation to, 41, 42–44. See also
331–332 ducklings, imprinting in, 399–401, adaptation
metacognition in, 356–358 402 in deprivation experiments, 46
dogs, learning by, 10–11 E global representation of, in
avoidance behavior in, 431 eating and feeding behavior spatial tasks, 353–355
in classical conditioning, 13–14, in appetitive learning. See input from, in information
28–29, 55, 80–81 appetitive learning processing, 23
in conditioned reinforcement, in bulimia nervosa, 415 Skinner on, 18
255 of children, 373–374 Watson on, 16
learned helplessness in, classical conditioning of, 55–56, episodic memory, 332–335
431–432 63 Epstein, Leonard, 48
Pavlovian-instrumental transfer depletion-repletion theory of, equivalence, acquired, 317–320
in, 389 371–376 escape behavior, 433–434
in shuttle box experiments, 389, feeding behavior system in, and law of effect, 51
415, 431 193, 194–195 reinforcement of, 52
Domjan, Michael, 195 food preferences in, 374 Estes, William K., 248
drinking behavior habituation in, 49 ethology, 43, 225–226
and need for water, 371–372 meal frequency, size, and cost evolution, 9–10
in schedule-induced in, 373 of analogous and homologous
polydipsia, 439 overeating in, 31, 63, 256, 271 traits, 221
Drive, 365–367 sign tracking in, 63, 87 and behavior, 42–50
compared to reinforcement, Rescorla-Wagner model on, revaluation of US in, 215, 216
284–288 119–122, 128, 166 startle response in, 175–176,
exaptation in, 220–221 sequential theory of, 385–387 186–187
generality and specificity in, and spontaneous recovery, 66, timing of outcome in, 67, 68
220–222 169, 209 of warning signals, 413,
and instrumental conditioning, in taste aversion learning, 209, 415–420
54 218 feathers, 221
learning mechanisms in, 207– extrinsic rewards, 382–384 feature-negative discrimination,
208, 220–222 eyeblink conditioning 178, 179, 180, 181, 183, 185, 309
in natural selection, 42, 54, Perruchet effect in, 239–240 feature-positive discrimination,
284–285, 294–295 and pseudoconditioning, 95 178, 179–180, 181, 183, 185
taste aversion learning in, 207, in rabbits, 85–86, 138, 148–149, feature stimulus, 178
211 152, 187 feature theory, 302–304
exaptation, 220–221 relative validity in, 222 feeding behavior. See eating and
excitation, 103 sensitization in, 95 feeding behavior
conditioned, 97 trace interval in, 206 finches
of fear, 161–162 eyebrow flashing, 45 beak color preferences of, 311
and generalization gradient, eyewitness testimony, accuracy natural selection of, 42
308, 309, 310, 312 of, 164 first-order conditioning, 216
occasion setters affecting, 185 F fish, territoriality and reproduction
Rescorla-Wagner model on, 128 facial images, categorization of, of, 56–58
excitors, 97, 103 304 fitness, 42
positive V value of, 120 fading procedure, 254 in instrumental conditioning, 50
Rescorla-Wagner model on, Fanselow, Michael S., 425 fixed action patterns, 44–45
120, 121–122 fear, 82–83 fixed interval schedule of rein-
summation of, 99, 183 as acquired drive, 414 forcement, 258, 259, 336, 339
exemplar theory, 305 affective dynamics in, 397, 398 fixed ratio schedule of reinforce-
expectancy agoraphobia in, 415, 428 ment, 258
and avoidance behavior, 429, 430 anxiety disorders in, 198 Flaherty, Charles, 377–378
in awareness of CS, 215 avoidance behavior in, 413–416, flavors
motivation in, 387–394 428 and calories, 56, 63, 64, 214
and Perruchet effect in eyeblink backward conditioning in, 92 preferences for, 374
conditioning, 239–240 conditional stimulus in, 86–87, in taste aversion learning. See
experience 160, 396 taste aversion learning
drive acquired in, 414 conditioned suppression in, flexibility of behavior, 249
learning in, 3, 7, 8, 26 86–87, 161, 162, 415, 416 flight response in avoidance be-
motivation acquired in, 376, 399 counterconditioning in, 171 havior, 420, 421
experts, subtle discrimination by, desensitization in, 171 focal sets, 235
313 excitation of, 161–162 food. See also eating and feeding
explicitly unpaired procedure, 98 exposure therapy in, 66, 170 behavior
exposure therapy, 66, 121 extinction of, 66, 121, 169, 170, in appetitive conditioning,
extinction in, 170 176, 177, 184 89–90. See also appetitive
relapse in, 170 freezing reaction in, 59, 70, 87, learning
external inhibition, 150, 152 174, 391, 421–424 in autoshaping of pecking
extinction, 64–66, 97, 101, 166–171, inhibition of, 94, 161–162 behavior, 87–89, 182–183
172t in Little Albert, 15–16, 86 behavior systems related to,
context of, 167, 168, 169, 170, memory in, 160, 161, 173–174, 193, 194–195
177–178 176, 177 caching of, 332–333, 334
in counterconditioning, 171–172 modules in, 222 in classical conditioning
in exposure therapy, 170 occasion setters in, 184 experiments, 13–14, 28–29,
of fear, 66, 121, 169, 170, 176, one-trial learning in, 209 55–56, 70, 80–81
177, 184 Pavlovian-instrumental transfer compensatory responses to, 189
and frustration, 382 in, 390 deprivation of, 278, 279, 364–
of inhibition, 128 perceptual-defensive 365, 366–367, 368
and persistence in partial recuperative model of, 425 in fear conditioning, 86–87
reinforcement, 384–387 potentiation in, 219 foraging for. See foraging
protection from, 121 preparedness in, 73–74 behavior
and rapid reacquisition, 171 in rats, 59–60, 67, 68, 70, 82–83, healthier, choice for, 272,
and reinstatement effect, 171 86–87, 160, 161, 173–174, 186 275–276
and renewal effect, 166–169, relative validity in, 222 in instrumental conditioning
170, 387 renewal of, 170 experiments, 18–19, 28, 29
memory of animals on, 332–335 generalization gradients, 306–313 Hall-Pearce negative transfer, 132,
need for, 276–277, 371–376 excitatory and inhibitory, 134, 135, 140, 172t
as orienting behavior reward, 308–310, 312 Hampton, Robert, 356
133–134 interactions between, 309–313 hamsters
preferences for, 374 peak shift in, 310, 311, 313 in food deprivation
in puzzle box experiments, transposition in, 312 experiments, 366, 367
12–13, 29 generalized learned irrelevance, 432 navigation and spatial behavior
rearing behavior in response general Pavlovian-instrumental of, 343
to, 181 transfer, 391–394 preparedness in instrumental
as reinforcer, 252, 255, 277–279 genes, 42 conditioning of, 72–73
salivary response to, 48–49 mutations in, 287 solitary feeding of, 188, 194
signals for, 55–56 geometric ions (geons), 321, 322 head-jerking behavior, 181–182, 188
in Skinner box experiments, geometric module in brain, 346, 353 healthy foods, choice for, 272,
18, 28 Gibbon, John, 93, 340 275–276
and social behavior, 188, 194 glucose blood levels in insulin heat lamp, motivational effects
and taste aversion learning. See response, 191, 192 of, 370
taste aversion learning goal reaction, 379–380 hedonic shift, 213–216, 370
food-cup entry of rats, 89–90 fractional anticipatory, 379, 380 hedonism, 6
conditioned reinforcement in, goals helplessness, learned, 431–435, 436
255 and action learning, 459 Helplessness: On Depression, Develop-
priming in, 139–140 and flexibility of behavior, ment and Death (Seligman), 433
foraging behavior 249–250 heroin use, 4, 170
of crows, 29–31, 34, 51, 54 and motivation, 249, 364, 366, conditioned compensatory
of honeybees, 223–226 370 response in, 62
of pigs, 72 in spatial tasks, 350, 459 cues in, 184
forgetting, 66, 161–166, 177, 332 goal tracking, 89 discounting delayed rewards
causes of, 163–166 golden hamsters in, 271
and extinction, 166–171 in food deprivation herring gulls, 44, 46
in generalization decrement, 327 experiments, 366, 367 Herrnstein, Richard, 260
of latent inhibition, 172 navigation and spatial behavior hidden units, 232–233
reminders reducing, 162–163, of, 343 higher-order conditioning, 83–84,
164, 165 preparedness in instrumental 216
forward blocking, 234, 235, 236, 238 conditioning of, 72–73 hippocampus
fractional anticipatory goal reac- goldfish, 227 in memory, 330
tion, 379, 380 Gormezano, Isadore, 85 in spatial behavior, 346–347,
free-operant avoidance, 416–417 gouramis, blue, territoriality and 349, 459, 460
free-running animals, 335 reproduction of, 56–58 historical development of learning
freezing reaction, 118, 195, 420, gratification, delay of, 266, 270–271 theory, 3–37
421–424, 427 gulls Hobbes, Thomas, 6
in fear conditioning, 59, 70, 87, eggshell removal by, 43, 44 Hoffman, Howard, 399–401
174, 391 instrumental conditioning and Holland, Peter, 178
Freud, Sigmund, 15 innate behavior in, 46 Hollis, Karen, 55
frustration, 380–382, 387 mobbing behavior of, 44 homeostasis, 365
in partial reinforcement, 385 gut defense system, 218 homologous traits, 221
G Guthrie, Edwin R., 159, 247–249, 252 honeybees
Gallistel, Randy, 93 H approaching signals, 63, 226
gambler’s fallacy, 240 Habit, 22, 365, 379 blocking effect in, 227
Garcia, John, 55, 70, 206, 216–218 habit learning, 456–461 conditioned inhibition in, 227
generality of laws of conditioning, habituation, 35, 47–50, 95 foraging by, 223–226
205–243 context in, 142 in relative validity experiments,
in matching law, 262–263 dual-process theory of, 50 223–225
generalization, 81, 84 and extinction, 66 short-term memory of, 226
and acquired equivalence, and latent inhibition, 142, 172 successive negative contrast
317–320 and opponent process theory, effects in, 227
decrement in, 150, 327 403–404 hot thoughts, 270
and discrimination, 305–320 priming in, 141–142 Hull, Clark L., 18, 21–22, 246, 252,
gradient of, 306–313 reinforcer devaluation effect in, 276–277, 365
mediated, 317–320 456–459 humans
and perceptual learning, spontaneous recovery in, 48, 66 adjunctive behaviors in, 440
313–317 stimulus-specific, 48–49 associative learning in, 228–241
temporal, 336–337 Hall, Geoffrey, 132–134 avoidance learning in, 429–430
blocking effects in, 228, 234–237 impulsiveness, 266–271 occasion setters affecting, 185,
category and causal learning in, inactive nodes, 144, 145 186
228–237 incentive-based treatment, 270, and perceptual learning, 316
conditional response in, 215 271–272 Rescorla-Wagner model on, 119–
conditioned inhibition in, 228 incentive learning, 368–370, 379, 122, 128–129, 138–139, 166
in conditioning experiments, 404–405 SOP model on, 147–148
15–16 and reinforcer devaluation inhibitors, 97, 98–100, 103
defense reactions in, 426 effect, 449–450 negative V value of, 120
delay discounting in, 267–268 incentive motivation, 379–380 summation of, 99, 183
drug use of, 33, 34. See also drug Incentive Relativity (Flaherty), innate behaviors, 44–46
use 377–378 fixed action patterns in, 44, 45
eating and overeating in, 31, 63, independents, in behavioral eco- instinctive drift, 437
256, 271 nomics, 274–275 instrumental action, 293–361
fixed action patterns in, 45 induction, 53 cognitive analysis of, 445–461
forgetting by, 165 industriousness, learned, 385 goal-directed, 364, 366
infant learning and memory, inelastic commodities, 273 motivation of, 363–408
31–33 inescapable shocks, 436 synthetic perspective of,
instrumental conditioning in, learned helplessness in, 411–464
50–51 431–435 instrumental conditioning, 28, 41,
learned helplessness in, infants 79–80
432–433 as depletion-repletion feeders, adaptation in, 50–54
as machines, 5–6 373 compared to classical
matching law in, 262 fear conditioning of, 15–16, 86 conditioning, 28–29, 41, 64–74
metacognition in, 355–356 learning and memory of, 31–33 cumulative recording of
mind-body distinction in, 5 shaping behavior of, 52–53 responses in, 257–258
paradoxical reward effects in, smiling of, 50–51 extinction in, 64–66
382–384 inferential reasoning, 233 flexibility of behavior in, 249
Pavlovian-instrumental transfer inflation treatment, 215 innate behavior in, 46
in, 393 information processing, 26, law of effect in, 51–52, 64
play of children, 31, 278 320–335 matching law in, 261–263
potentiated startle effects in, attention in, 325–326 preparedness in, 72–74
186–187 computer metaphor in, 22–25 reinforcement in, 52, 246,
preparedness in, 73–74 connectionist view of, 24–25 254–288
reflex actions of, 5–6 long-term memory in, 23, 136, response-outcome learning in,
reinforcer devaluation effect in, 137, 332–335 29, 34, 245–246
447–449, 458 priming in, 136, 137–141 shaping in, 52–54, 72
relative validity effect in, short-term memory in, 23, size of outcome in, 69–70
222–223 136–137, 320 Skinner on, 17–19, 247, 252–254,
salivary response in, 48–49 speed of, 23 276, 285
shaping behavior in, 52–53 time cognition in, 340–341 stimulus control in, 253–254
taste aversion learning in, 55, visual perception in, 321–325 superstitious behavior in, 253
73, 90, 96, 106, 207 working memory in, 320, timing of outcome in, 67, 68–69
visual perception in, 321, 323 326–332 voluntary behavior in, 18–19,
Hume, David, 7, 8, 241 inhibition, 9, 103 253
hunger associations in, 316 insulin, compensatory response to,
behavior patterns in, 366–367 in backward conditioning, 98, 191, 192
incentive learning in, 368–369, 147–148 intensity of stimulus, and strength
449 in bidirectional response of conditioning, 94
motivation in, 364–365, 366, systems, 100 interference, 164, 166, 171, 173, 177
368, 370, 371–376, 404 conditioned, 97–101. See also catastrophic, 231
needs and drive in, 276–277 conditioned inhibition in long-delay learning, 210–211
Pavlovian-instrumental transfer in connectionist approach, 25 proactive, 164, 328
in, 392–393 of delay, 98, 336, 339 retroactive, 164, 328
response to need in, 371–376 differential or discriminative, in taste aversion learning, 210,
hybrid attentional models, 135, 136 98 212
I external, 150, 152 verbal, 172t
immunization effect, 432, 434, 435 extinction of, 128 interim behaviors, 438, 439, 440
impressions, associations between, of fear, 161–162 internal clock, 340–342, 343
7–8 and generalization gradient, interoceptive cues, 199
imprinting, 399–401, 402 308–309, 310, 311, 312 intertrial intervals, 92, 93
latent. See latent inhibition
interval schedule of reinforcement, definition of, 4 working memory in, 326–328
258, 259–260, 451–452 impact of variation/selection maze experiments, 18
matching law on, 284 idea on, 287 artificial selection in, 46
in peak procedure, 337–338 philosophical roots of, 5–9 delayed reward in, 211
response rate in, 452 lever pressing, 18, 245, 252 episodic-like memory in,
scalar property in, 338 in acquired drive experiment, 334–335
superposition of response 414 flexibility of behavior in,
curves in, 337, 338, 339 in avoidance behavior, 417, 418, 249–251
in temporal bisection 420, 427–428 habit learning in, 459–461
procedure, 338–339 behavior chain in, 255–256 latent learning in, 251, 252
time cognition in, 336–339 conditioned reinforcement in, motivation in, 249–251, 364
intervening variables, 20–21, 22 255 radial maze in, 328–330, 334–
intrinsic motivation, 382 cumulative recording of, 335, 344
introspection, 14 257–258 reference memory in, 330,
J in deprivation experiments, 334–335
Japanese quail, reproductive be- 367, 368–369 rewards in, 249–252
havior of, 58–59, 195–197 incentive learning in, 368–369, spatial behavior in, 344, 346–
449–450 355, 459
K Pavlovian-instrumental transfer water maze in, 249, 347–349,
Kamin, Leon, 103–105, 114, 117, 125 in, 390, 392 350–351
Kant, Immanuel, 8, 9, 241–242 Premack principle on, 277–278, working memory in, 328–331
kittiwakes, 43 280, 281 McCollough effect, 189
klaxon device, in fear conditioning punishment in, 280, 443–444 McLaren, Ian, 314
of rats, 82–83 renewal effect in, 170 meal frequency, size, and cost, 373
Kremer, E. F., 122, 123, 124 stimulus control in, 253, 293 mediated generalization, 317–320
L in temporal bisection in categorization, 319
landmarks, as cues in spatial be- procedure, 338–339 medicine effect, 64
havior, 344, 347, 348, 352 lithium chloride, in taste aversion melioration process, 263
latent inhibition, 93–94, 100–101, learning trials, 209, 210, 214, memory, 160–177
134, 172–173 216, 446, 449 in animal and human learning,
context in, 140, 172–173 Little Albert, conditioning experi- 25, 26, 27
forgetting in, 172 ments on, 15–16, 86 of categories, transfer tests of,
and habituation, 142, 172 locale system in spatial behavior, 300, 301
learning in, 173 349 in classical conditioning, 160, 332
Mackintosh model on, 131 Locke, John, 7 conditional stimuli in retrieval
priming in, 140, 142 long-delay learning, 206, 209–211 of, 25, 137–138
Rescorla-Wagner model on, long-term memory, 332–335 consolidation of, 173
128–129, 138–139 in information processing, 23, context affecting, 165–166,
in taste aversion learning, 212 136, 137, 332–335 167–179
latent learning, 251, 252 in maze experiments, 330 declarative, 332
law of effect, 13, 51–52, 64, 246–247 nodes in, 143 episodic, 332–335
quantitative, 264–266 retrieval-generated priming erasure of, 173–177
law of parsimony, 11 of, 137 evolution of, 221–222
laws of conditioning, generality of, in time cognition, 340 and extinction, 166–171
205–243 Lorenz, Konrad, 44, 399 and forgetting, 161–166. See also
learned helplessness, 431–435, 436 M forgetting
learned helplessness effect, 432 machines, view of people as, 5–6 of infants, 32–33
learned helplessness hypothesis, Mackintosh, Nicholas, 129, 130, in information processing, 23,
432 135, 314 136–137, 320, 326–335
learned industriousness, 385 Mackintosh model, 130–132, 134, interferences affecting, 164, 166,
learned irrelevance, 432 135, 136 171, 173, 177, 328
learning curve, 145 Maier, Steven, 431 and learning, 160–177
asymptote of, 115, 116, 119, 122, maltodextrin, 374 long-term. See long-term
123 Man a Machine (de la Mettrie), 6 memory
contiguity theory of, 247 Mapletoff ice cream, in taste aver- in matching-to-sample task,
Rescorla-Wagner model of, sion experiment, 96, 96 326–327, 356–358
115–117 Margaret image, 296, 297, 302 and metacognition, 356–358
learning/performance distinction, massed trials, 93 and metamemory, 356
251–252, 364 matching law, 261–263, 264, 276, 284 nodes in, 143–148. See also nodes
Learning Theory, 3–37, 159, 187, 445 matching-to-sample procedures, 318 in partial reinforcement, 385,
biological roots of, 9–14 delayed, 318, 327–328, 356–358 386–387
practice improving, 327 monkeys and Drive, 276–277, 366
priming of, 136, 137–141 in bait and switch experiments, and motivation, 276–277, 366,
procedural, 332 376 371–376
prospective code in, 330, 331 fear learning of, 73–74 negative automaintenance, 441
rapid reacquisition of, 171 metacognition of, 356–359 negative contingency, 103, 125,
reactivation of, 163, 173, 174 Premack principle on behavior 126, 235, 441
reconsolidation of, 173–177 of, 278 negative contrast effects, 377–378,
reference, 330, 332–335 Morgan, C. Lloyd, 10–11 382
reinstatement effect, 171 Morgan’s canon, 11 negative correlation, 98
reminders improving, 162–163, morphine, 188, 190 negative occasion setters, 181, 183,
164, 165 in taste aversion learning trials, 185–186
renewal effect in, 166–169, 170 216 in avoidance behavior, 430
retrieval failure, 164–165, 166, tolerance to, 60–62, 216 negative patterning, 151, 232, 233
177 Morris, Richard, 347, 351 negative reinforcement, 52
retrospective code in, 330, 331 moths negative sign tracking, 64
semantic, 332 digital, blue jay pecking negative transfer, 132, 134, 135,
sensory, 23, 136 response to, 293–295 140, 172t
short-term. See short-term natural selection of, 42, 54, nervous system
memory 284–285, 294–295 brain in. See brain
in spaced trials, 93 motivation, 363–408 networks in, 24–25, 143
spontaneous recovery of, 166 acquired, 376, 399 in predation threat, 195
stimulus sampling theory of, in anticipation of reward and transmission in, 9
248 punishment, 376–396 in unconditional response,
and time cognition, 338–339, bait and switch studies of, 191–192
340–342 376–378 networks
trace decay of, 163–164 cathexes in, 370 connections in, 229–233
working. See working memory conditional stimulus in, hidden units in, 232–233
memory nodes, 143–148. See also 391–394 inputs and outputs in, 230, 231,
nodes deficits in, 431 232
metacognition, 355–359 depletion-repletion theory of, neural, 24–25, 143
metamemory, 356 371–376 neurotransmitters, 433
methamphetamine, affecting time and Drive, 276–277, 365–367, 371 nodes, 143–144
cognition, 341 excitation and inhibition in, 97 in A1 and A2 states, 144–148
mice, defense reactions of, 421–422 expectancies in, 387–394 activation of, 143–148, 168
microchoices, 347 goal reaction in, 379–380 association of, 144, 145
midazolam, 62 of instrumental action, 363–408 configural, 150–153
Miller, Neal, 277, 414 intrinsic, extrinsic rewards connections of, 24–25, 143,
Miller, Ralph R., 127 affecting, 382 229–230
mind and learning, 364–366, 368–370, elemental, 150–153
atomistic approach to, 7 371, 404 inactive, 144
as blank slate, 7 needs in, 276–277, 366, 371–376 in long-term memory, 143
and body distinction, 5 opponent-process theory of, in SOP model, 144–148
evolution of, 10 396–404 nonrewarded trials in partial rein-
rationalism approach to, 8 rewards in, 270, 376–396 forcement, 385–387
structuralism approach to, 14 Mowrer, O. H., 413 novelty of stimulus, and strength
Mineka, Susan, 74, 198 multiple oscillator model, 341–342 of conditioning, 93–94
minimum distance model, 282–284 multiple schedules of reinforce- Nudge: Improving Decisions About
misbehavior, 436–437 ment, 260 Health, Wealth, and Happiness
Mischel, Walter, 270 multiple-time-scale model, 342 (Thaler & Sunstein), 272
mobbing behavior, 44, 47 mutations, selection of, 287 O
mobiles, infant interactions with, 32 N obesity, habituation in, 49
modulation processes, 177–187 natural selection, 42, 54, 284–285 object learning, 80
modules, 222 compared to reinforcement, obsessive-compulsive disorders,
geometric, 346, 353 284–288 394, 415
momentum of behavior, 263–264 of moths, 42, 54, 284–285, occasion setters, 178–186
Monet paintings, pigeon classifica- 294–295 negative, 181, 183, 185–186, 430
tion of, 4, 298 need positive, 181, 183, 185
money, as conditioned reinforcer, depletion-repletion theory of, properties of, 181–183, 184
255 371–376 occasion setting, 178–186
in avoidance behavior, 430 in inescapable shock, 433 in response to digital moths,
in configural conditioning, 186 perceptual-defensive 293–295, 325
discriminative stimulus in, 254, recuperative model of, 425 retrospective and prospective
256 panic disorder coding in, 331–332
in stimulus-response-outcome avoidance behavior in, 415, 427 in stimulus-response-outcome
learning, 454–456 behavior system in, 198–199 learning, 455
odors, and taste aversion, 216–219 paradoxical reward effects, in superstition experiment,
Olton, David, 328 226–227, 381–384 438–439, 440
omission contingency, 441 parallel distributed processing, 24 in terminal behavior, 438–439
omission training, and law of Parker, Linda, 214 time cognition in, 335, 336
effect, 51 parsimony, law of, 11 visual perception in, 321–325
ondansetron, 449–450 partial reinforcement extinction working memory in, 327–328,
one-trial learning, 208–209 effect, 384–387 331–332
On the Origin of Species by Means pattern completion, 316 pepper moths, natural selection of,
of Natural Selection (Darwin), patterning, 151 42, 284–285
9, 10 negative, 151, 232, 233 perceptual-defensive-recuperative
operant, 18, 19 positive, 151, 233 model of fear and pain, 425
operant conditioning, 18–19, 26, Pavlov, Ivan, 13–14, 28–29, 97, 150, perceptual learning, 313–317, 320
28, 29 166 preexposure in, 315–316
operant experiment, 18–19, 26 conditioning experiments of. performance
operational behaviorism, 20–22 See Pavlovian conditioning differentiated from learning,
opponent process, 144, 396–397, Pavlovian conditioning, 13–14, 251–252, 364
399, 403 80–81, 388–389 rewards affecting, 384
opponent-process theory, 396–404 avoidance behavior in, 424, Perruchet, Pierre, 239
ordinal predictions, 99 426–431 Perruchet effect, 239–240
orienting response in rats, 47, conditioned inhibition in, 97 persistence of behavior, 364
133–134, 347 external inhibition in, 150 in partial reinforcement,
oscillators, multiple, 341–342 signals for food in, 55 384–387
outcome, 33–35, 445, 446 S-O learning in, 245 philosophical roots of Learning
in classical conditioning, 28–29, spontaneous recovery in, 166 Theory, 5–9
79, 245 Pavlovian-instrumental transfer, Phipp’s syndrome, as fictitious
conditioning with drugs as, 389–391, 453 disease, 223, 228, 230
60–63 general, 391–394 Picasso paintings, pigeon classifi-
and extinction, 65 outcome-specific, 393–394 cation of, 4, 298
in instrumental conditioning, payoff, 257–276 pigeons
29, 51–52, 245–246 peak procedure, 337–338 attention of, 325, 326
in Pavlovian-instrumental peak shift in generalization gradi- autoshaping behavior in, 68,
transfer, 393–394 ent, 310, 311, 313 87–89, 182–183, 222, 440–441
in response-outcome learning, Pearce, John, 132–134, 135, 150 blocking in, 234
245–246 configural theory of, 150–151 categorization by, 4, 295–305
and response relation, 446–452 Pearce-Hall model, 132–134, 135, choice for delayed reward in,
in sign tracking, 63–64 136, 150, 168 268–269
size of, 69–70 pecking behavior, 72 concurrent schedules of
and stimulus-outcome learning, attention in, 325–326 reinforcement in, 260–261, 262
245–246, 452–454 autoshaping of, 87–89, 182–183, discrimination by, 254, 305–311,
timing of, 66–69 440–441 323
overeating, 31, 63, 256, 271 categorization in, 295–305, 319 fading procedure in training
overexpectation effect, 123, 124 in concurrent schedules of of, 254
Overmier, J. Bruce, 431 reinforcement, 260–261, 262 generalization gradient in,
overshadowing, 106, 218–219 discrimination in, 254, 295–311 306–311
owls, chaffinch calling in response generalization gradient in, impulsiveness and self-control
to, 47 306–311 in, 268–269, 270
P impulsiveness and self-control in matching-to-sample
pacemaker, in information pro- in, 268–269, 270 experiments, 318, 327–328,
cessing model, 340, 341, 343 in matching-to-sample 331–332
pacifiers, and sucking behavior of procedure, 318, 327–328, primary reinforcement in, 256
infants, 31–32 331–332 quantitative law of effect in,
pacing, in sexual behavior system, mediated generalization in, 319 264–265
196, 197 in negative automaintenance, retrospective and prospective
pain 441 coding of, 331–332
endorphins in, 425 primary reinforcement of, 256 sign tracking of, 63, 64, 87
spatial behavior of, 344, 345 food, 374 punishers, 51
superstitious behavior of, 253, place, hippocampus in, 459 punishment, 442–445
437–439 Premack principle on, 277–282 anticipation of, 376–396
time cognition of, 335, 336, 339 response deprivation and law of effect, 51
transfer tests of, 300, 301 hypothesis on, 282 Premack principle on, 280
vision of, 295, 321–325 testing of, 280–281 by rewards, 382–384
working memory of, 327–328, Premack, David, 277 size of, 69–70
331–332 Premack principle, 277–282 stimulus and response learning
pigs limitations of, 280–282 in, 442–444
foraging behavior of, 72 preparatory response, 391 timing of, 67
misbehavior of, 437 in panic disorder, 198, 199 pupil dilation, 192–193
place cells, 346–347 preparedness, 70–74, 207–208, 222, puzzle box experiments, 12–13, 29,
place learning, 346–347, 459–461 420 246, 248
play of children, 31, 278–279 prey species Q
poisons, and taste aversion learn- antipredator behaviors of, quail
ing, 55, 68, 69, 72, 207, 211, 218 43–44, 195 bobwhite, 72
polydipsia, schedule-induced, 439 defense reactions of, 420–426 Japanese, 58–59, 195–197
positive contingency, 103, 235 freezing reaction of, 59, 195, quantitative law of effect, 264–266
positive contrast effects, 377–378 420, 421–424, 425, 427
positive occasion setters, 181, 183, natural selection of, 42, 54, R
185 284–285, 294–295 rabbits
positive patterning, 151, 233 recuperative behaviors of, eyeblink conditioning in, 85–86,
positive reinforcement, 52 425–426 138, 148–149, 152, 187
post-encounter mode, 195 price in relative validity experiments,
posttraumatic stress disorder, 149 and demand for commodity, 223
potentiation 273–274 raccoons, 437
compound, 216–220 of healthy foods, 275–276 Rachlin, Howard, 268
of startle response, 186–187 and substitutability of radial mazes
precommitment strategies, 269 reinforcers, 274–275 episodic-like memory in,
preconditioning, sensory, 83, 84 primary reinforcers, 255, 256–257 334–335
predation priming, 136, 137–141 spatial cues and learning in,
antipredator behaviors in, attentional, 326 344, 346–347
43–44, 195 of conditional stimulus, working memory in, 328–330
attention in, 325–326 138–141 radical behaviorism of Skinner,
behavior system in, 194–195 in habituation, 141–142 17–19
freezing reaction in, 59, 195, in latent inhibition, 140, 142 rapid reacquisition effect, 171
420, 421–424, 425, 427 retrieval-generated, 136, 137, rationalism, 8, 242
imminence of, 195, 426 140–141, 142 ratio schedule of reinforcement,
mobbing behavior in, 44, 47 self-generated, 136, 137, 258, 259–260, 451–452
and natural selection, 42, 54, 141–142 minimum distance model on,
284–285, 294–295 in SOP model, 145–147 283–284
recuperative behaviors in, of unconditional stimulus, 138, response rate in, 452
425–426 140 rats
search image in, 325–326 proactive interference, 164, 328 in acquired drive experiment,
species-specific defense probabilistic contrast model, 414
reaction in, 420–426 235–236 activity after morphine
and stimulus control, 294–295 procedural memory, 332 injection, 190, 191
prediction error, 116 proposition learning, 239, 240 antipredator behaviors of, 195
predictive value (V), Rescorla- propranolol, and memory recon- appetitive conditioning of,
Wagner model on, 115–124 solidation, 174–176 89–90
preencounter mode, 195, 199 prosocial behavior, reinforcement approach response of, 63, 64
preexposure effect of, 266 in artificial selection
of conditional stimulus, 101 prospective code, 330, 331 experiments, 46
in perceptual learning, 315–316 protein avoidance learning of, 412–424,
in taste aversion learning, in diet, and water requirements, 427–428
211–212, 315–316 371–372 in bait and switch experiments,
of unconditional stimulus, 94 synthesis inhibitors affecting 376
preferences memory reconsolidation, blocking effect in, 103–105,
beak color, 311 173, 174 117–118, 234, 236–237
behavioral regulation theory prototype theory, 304–305 cold tolerance of, 189
on, 282 pseudoconditioning, 95–97 conditioned reinforcement of,
255
CS-US contingencies in 446–447, 449, 457 conditioned, 254–257
conditioning of, 102 renewal effect in, 166–167, 169, and contiguity theory, 247–249
cumulative recording of 170 in contingency management
behavior, 257–258 response to discriminative and incentive-based
defense reactions of, 421–424, stimulus, 452–453 treatments, 270, 271–272
428 rotational errors of, 344–345, continuous, 257, 384
discrimination by, 314 346, 354 definition of, 276
drinking behavior and water schedule-induced polydipsia delayed, 255, 256, 262–263,
need of, 371–372, 439 of, 439 266–271
drug tolerance in, 61–62 self-generated priming in, Drive reduction theory of,
episodic-like memory of, 139–140 276–277
334–335 shaping behavior of, 52, 53 of drug use, 33, 265–266,
fear conditioning of, 59–60, 67, sign tracking of, 63–64 274–275
82–83, 86–87, 160, 173–174, 186 in Skinner box experiments, and extinction, 65, 66, 384–387
feeding behavior of, 194–195, 17, 18 in instrumental conditioning, 52
372, 374 social behavior of, 188, 194 and latent learning, 251, 252
flexible goal-oriented behavior spatial behavior of, 344–355 matching law on, 261–263, 264,
of, 249–250 startle response of, 47, 186 276, 284
food choices of, 55, 56, 63 stimulus and response learning melioration process in, 263
food-cup entry of, 89–90, of, 443–444 minimum distance model on,
139–140, 255 stimulus-response-outcome 282–284
in food deprivation learning of, 455–456 partial, 384–387
experiments, 365, 366, 367, taste aversion learning of, 8–9, Premack principle on, 277–282
368–369 55, 67–68, 70–72, 90, 206, primary, 255, 256–257
forgetting by, 162–163, 164–165 210–220, 315–316, 375 of prosocial behavior, 266
freezing reaction of, 59, 87, 118, thiamine-deficient diet in, 375 and quantitative law of effect,
174, 421–424, 427 time cognition of, 335, 336, 337, 264–266
frustration of, 381 338–339, 341, 342 schedule of. See schedule of
goal reaction of, 379–380 working memory of, 328–331 reinforcement
habituation in, 47, 459 Rayner, Rosalie, 15, 16 secondary, 255
head-jerking behavior of, reactivation of memory, 163 selection by consequences in,
181–182, 188 reconsolidation in, 173, 174, 284–288
incentive learning of, 368–370 176–177 shaping behavior with, 52–54,
latent learning of, 251, 252 rearing behavior, 181–182, 188 253
lever pressing by. See lever recognition by components theory, stamping-in type of, 13, 246,
pressing 321 276, 284, 287, 376, 379, 440,
in maze experiments, 46, 211, reconditioning, 101 446, 447
249–251, 328–331. See also reconsolidation of memory, substitutability in, 272–275
maze experiments 173–177 of superstitious behavior, 253
motivation of, 364–365 prevention of, 173–176 theories of, 276–288
negative and positive contrast recuperative behaviors, 425–426 timing of, 67, 256–260
effects in, 377–378 reference memory, 330, 332–335 value of, 263, 267–268
in operational behaviorism, in time cognition, 340, 341, 342 reinforcement learning models,
20–21 reflexes, 5–6 458–459
orienting response of, 47, conditioned, 13–14 reinforcer devaluation effect,
133–134, 347 eyeblink, 85–86 446–449, 452
paradoxical reward effects in, inhibition of, 9 in habit learning, 457–459
227 stimulus and response in, 17–18 reinforcers, 18
partial reinforcement extinction Reflexes of the Brain (Sechenov), 9 conditioned, 255
effect in, 385, 386–387 reinforcement, 246–249, 254–288 primary, 255, 256–257
Pavlovian-instrumental transfer Ainslie-Rachlin rule on, secondary, 255
in, 390, 392–393 268–269 sensory aspects of, 393, 394
perceptual learning of, 314, in behavioral economics, value of, 446–449
315–317 272–276 reinstatement effect, 171
Premack principle on behavior behavioral regulation theory of, relapse after therapy, 170
of, 277–278, 279–280 282–284 relative validity, 106–109, 222–225
punishment of, 443–444 of behavior chain, 255–256 in conditioning, 106–109
rearing behavior of, 181–182, 188 and choice, 260–276 generality of, 222–225
reference memory of, 330, circularity in, 276 in symptom-disease situation,
334–335 compared to evolution by 230
reinforcer devaluation effect in, natural selection, 284–288 releasers, in fixed action patterns, 44
reminders improving memory, in drug use, 33 S
162–163, 164, 165 in eating and overeating, 31 saccharin
renewal effect, 166–169, 170, 172, preparedness in, 70–74 need and drive for, 277
387 and signal learning, 64–74 in taste aversion learning trials,
repetition and stimulus learning, 245–246, 209, 210, 215–216
attention in, 326 440–442 safety behaviors in anxiety disor-
habits formed in, 35, 458 response-outcome learning, 29, 34, ders, 428
of rewarded and nonrewarded 35, 245, 446–452 safety learning, 211–213, 427
trials, 386 in eating and overeating, 31 salivation
replaced elements, 152–153 in punishment, 442–444 in dogs, conditioning of, 13–14,
reproductive behavior and stimulus-outcome learning, 55, 80–81
beak color preferences in, 311 245–246, 440–444, 450–451, 454 habituation of, 48–49
behavior systems in, 193, retardation-of-acquisition test, and pupil dilation, 193
195–197 99–100, 101 salt, hunger for, 364–365
classical conditioning in, 56–59 retrieval failure, 164–165, 166, 177 salty taste, 212
fitness and success in, 42 retrieval-generated priming, 136, same/different judgments,
need and drive in, 277 137, 140–141 323–325
territoriality in, 56–59 in habituation, 142 satiety, sensory-specific, 447
Rescorla, Robert, 101–103, 114, 182 retroactive interference, 164, 328 scalar property of interval timing,
Rescorla-Wagner model, 114–130, retrospective code, 330, 331 338
135 revaluation effect, 216 scalloping pattern of behavior, in
on blocking effect, 117–118, Revusky, Sam, 210 fixed interval reinforcement,
129–130, 131, 134, 235 reward learning, 51, 52 259, 336, 339
on category and causal rewards, 257–276 schedule-induced polydipsia, 439
learning, 229–231 Ainslie-Rachlin rule on, schedule of reinforcement, 256–260
on CS-US contingencies, 125– 268–269 concurrent, 260–261
127, 235 anticipation of, 376–396 continuous, 257, 384
on discrimination, 179, 180, 181 in bait and switch experiments, interval. See interval schedule
on extinction, 119–122, 128, 166 376 of reinforcement
on inhibition, 119–122, 128–129, contiguity theory of, 248 matching law on, 261–263, 264,
138–139, 166 delayed, 211, 255, 266–271 276, 284
limitations of, 128–130 expected value of, 379 minimum distance model on,
on short-term memory, 137, 138 extrinsic, harmful effects of, 283–284
research methods, 4 382–384 ratio, 258, 259–260, 451–452
on animal and human learning, and frustration, 380–382 schizophrenia, shaping behavior
25–27 and law of effect, 51, 52 in, 53
conditioning experiments in, in maze experiments, 249–252 Schull, Jonathan, 403
12–16 as motivation, 270, 376–396 scrub jays, episodic-like memory
operant experiments in, 18–19 negative effects of, 381–384 of, 332–333, 334
regulations on, 27 paradoxical effects of, 226–227, search image phenomenon, 295,
respondent, 19 381–384 325–326
response, 33–35, 445, 446 in partial reinforcement, search mode
in classical conditioning, 79, 384–387 in feeding behavior, 194–195
80–81 in positive reinforcement, 52 in sexual behavior, 195–196, 197
conditional, 81, 187–200. See also punishment by, 382–384 Sechenov, Ivan, 9
conditional response self-control and impulsiveness secondary reinforcers, 255
elicited, 19 in, 266–271 second-order conditioning, 83–84,
in information processing size of, 69–70, 266–271 216
approach, 23 timing of, 67, 268 selection by consequences, 284–288
in instrumental conditioning, value of, 263, 267–268, 446–449 self-control, 266–271
19, 29, 79–80 rG-sG mechanism, 379–380 hot and cold thoughts affecting,
and outcome relation, 446–452 rhesus monkeys, metacognition of, 270
in puzzle box experiments, 356–359 precommitment strategies in,
12–13 Romanes, George, 10 269
Skinner on, 17–18 rotational errors, 344–345, 346, self-generated priming, 136, 137,
unconditional, 80, 81, 187. See 353, 354 141
also unconditional response Rovee-Collier, Carolyn, 32–33 of conditional stimulus,
response deprivation hypothesis, running, preference of rats for, 139–140
282 279–280 in habituation, 141–142
response learning, 28, 245 running wheel experiments, avoid- in SOP model, 145
in children, 31, 32 ance behavior in, 418–419 of unconditional stimulus, 140
The Selfish Gene (Dawkins), 42 as fixed action pattern, 45 “stamping-in” of S-R association,
Seligman, Martin, 207, 432–433 in parent–infant interactions, 13, 246, 276, 284, 287, 376, 379,
semantic memory, 332 50–51 440, 446, 447
sensitization, 49–50 smoking standard operating procedure
in classical conditioning, 95–97 behavior chain in, 256 model, 144–148. See also SOP
to drugs, 192, 193 contingency management and model
and systematic desensitization, incentive-based treatments standard pattern of affective dy-
171 of, 270, 271, 272 namics, 397–398
sensory aspects of reinforcers, 393, discounting delayed rewards startle response, 47, 48, 175–176
394 in, 271 habituation of, 403
sensory conditioning, AESOP impulsiveness and self-control potentiated, 186–187
model on, 149 in, 266, 270 Stetson, Lee, 139
sensory memory, 23, 136, 148–149 Pavlovian-instrumental transfer stimulus, 33–35, 445, 446
sensory nodes, 148–149 in, 390, 393 antecedent, 19
sensory preconditioning, 83, 84 shaping methods in treatment in behavior chain, 256
sensory registers, in information of, 53 categorization of, 295–305
processing approach, 23 snack foods, 373–374 in classical conditioning, 28–29,
sensory-specific satiety, 447 snakes, fear of, 73–74 63–64, 79, 80–81
sequential theory, 385–387 social attachment, emotions in, conditional, 80–81, 101–109. See
serial discrimination, 182, 184–185 399–401 also conditional stimulus
sexual behavior, 56–59, 195–197. social behavior of rats, food-relat- contiguity theory of, 247–249
See also reproductive behavior ed, 188, 194 discriminative, 254–255, 293,
Shanks, David, 230 Solomon, Richard, 396–397 452–453, 454
shaping behavior, 52–54, 72, 253 sometimes opponent process fading of, 254
and autoshaping, 68, 87–89, model, 144–148. See also SOP feature, 178
182–183, 222, 440–441 model generalization of response to,
timing of outcome in, 67 SOP model, 144–148, 153–154, 190, 81, 305–320
Sheffield, Fred, 277 236 guiding instrumental action,
Shettleworth, Sara, 72 affective extension of, 148–149, 293–361
short-term memory, 136–143 187, 391 in habituation, 48–50
of bees, 226 as elemental theory, 151 in information processing
capacity of, 137, 138 extinction in, 168 approach, 23
in information processing priming in, 145–147 intensity of, 94
approach, 23, 136–137, 320 replaced elements in, 152 motivating, dynamic effects of,
in taste aversion learning, space, cognition of, 343–355 396–405
212–213 spaced trials, 93 novelty of, 93–94
shuttle box experiments, 389, spatial behavior, 343–355 in operant experiments, 19
412–416, 418, 419, 421, blocking effect in, 350–352 and outcome relation, 452–454
427–428, 431 cognitive map in, 343, 349 prototype of, 304–305
Sidman avoidance, 416–417, 422 cues in, 343–346, 349, 350–353 pseudoconditioning to, 95–97
Siegel, Shepard, 60, 62, 188–189 geometric module in, 346, 353 relevance of, in long-delay
signal learning, 64–74 global representation of learning, 210–211
for food, 55–56 environment in, 353–355 in rG-sG mechanism, 380
preparedness in, 70–74 goals in, 350 sensitization to, 49–50, 95–97
sign stimuli, in fixed action pat- in maze experiments, 346–349 in sensory preconditioning, 84
terns, 44 place learning in, 346–347, and sign tracking, 63–64
sign tracking, 63–64, 87, 391, 395, 459–461 Skinner on, 17–18
441, 442, 445 rotational errors in, 344–345, substitution of, 81, 187–190
simultaneous conditioning, 91, 92 346, 353, 354 summation of, 99, 100, 101, 183
simultaneous discrimination, 182, specializations, adaptive, 221 target, 178
184, 185 species-specific defense reactions, unconditional, 80, 81. See also
situation, in puzzle box experi- 420–426, 427 unconditional stimulus
ments, 12–13 Spence, Kenneth, 252, 309 stimulus control, 253–254, 293–295
size of outcome, 69–70 spiders in fading procedure, 254
skin defense system, 218 cocoon-building behavior in, generalization gradient in, 307
Skinner, B. F., 17–19, 29, 247, 44–45 in lever pressing experiment,
252–254, 272, 276, 285, 437 fear of, 73–74, 170 253, 293
Skinner box experiments, 17, 18, spontaneous recovery, 166, 172 stimulus elements, 247–248
19, 28, 245, 252, 253 in extinction, 66, 169, 209 stimulus learning, 28, 159
slave processes, 399 in habituation, 48, 66 in children, 31, 32
smiling Staddon, John, 284 in drug use, 33
in eating and overeating, 31 and extinction, 167 taxon system in spatial behavior,
and response learning, 245–246, in fear conditioning, 82, 86, 87 349
440–442 and inhibition of delay, 339 temporal bisection procedure,
stimulus-outcome learning, 29, 34, in lever pressing and lifting, 444 338–339
35, 452–454 in LN–shock trials, 129 temporal context, 169
in eating and overeating, 31 in punishment, 67, 69 temporal generalization, 336–337
in Pavlovian conditioning, 245 in unblocking experiment, 105 terminal behaviors, 438–439
in punishment, 442–444 surprise territoriality in reproductive be-
and response-outcome of conditional stimulus, 138– havior, 56–59
learning, 245–246, 440–444, 139, 140, 142 Terry, W. S., 138
450–451, 454 information processing in, 137 Thaler, Richard, 272
stimulus-response learning, 81, Pearce-Hall model on, 133, 134 theoretical constructs, 20–22, 26
82, 83 priming reducing, 137 thiamine deficiency, 375
contiguity theory of, 247–249 of unconditional stimulus, thigmotaxis, 421, 426
and habit learning, 456–461 114–117, 121, 129, 138 thirst
reinforcement in, 246 sweet taste, 56 incentive learning in, 369–370
stimulus-response-outcome learn- swimming in water maze experi- motivation in, 369–370, 404
ing, 454–456 ments, 249, 347–349, 350–351 and schedule-induced
stimulus sampling theory, 248 symbol manipulation, 23 polydipsia, 439
stimulus-stimulus learning, 81–82 synthetic perspective of instru- as theoretical construct, 20, 21
in fear conditioning, 83 mental action, 411–464 Thorndike, Edward L., 12–13, 29,
in sensory preconditioning, 84 T 246, 252, 276
stimulus substitution, 81, 187–190 tabula rasa, 7, 241 threat conditioning, 86
Stockton, Leigh, 216 tandem schedule of reinforcement, Timberlake, William, 194
stress, controllability affecting, 260 time cognition, 335–343
434–435 target stimulus, 178 behavioral theory of, 342–343
structuralism, 14 taste, 55–56 circadian rhythm in, 335–336
substitution associated with calories, 56, information processing model
of reinforcers, 272–275 63, 64 of, 340–341
of stimulus, 81, 187–190 preferences for, 374 internal clock in, 340–342, 343
successive negative contrast ef- taste aversion learning, 8–9, 106, in interval timing, 336–339
fects, 227 206–220 in Sidman avoidance
sucking behavior of infants, 31–32 in alcohol use, 55, 73, 209 procedure, 417
sucrose solution, as reinforcer, bright-noisy water in, 71 time of day cues in, 335–336
279–280 in chemotherapy, 90, 96, 106 timing of outcome, 66–69
summation of excitors and inhibi- compound potentiation in, timing of reinforcement, schedule
tors, 99, 100, 101, 183 216–220 of, 256–260. See also schedule
Sunsay, Jay, 139 counterconditioning in, 172t of reinforcement
Sunstein, Cass, 272 extinction in, 209, 218 timing of stimulus presentation,
superposition in interval timing, hedonic shift in, 213–216 91–93, 147
337, 338, 339 interference in, 210, 212 Tinbergen, Niko, 43, 46
superstitious behaviors, 253, latent inhibition in, 212 tolerance
437–440 long-delay, 206, 209–211 to cold, 189
suppression odor in, 216–219 to drugs, 60–62, 193, 402–403
of appetitive performance, 391 one-trial, 208–209 Tolman, Edward C., 20–22,
in avoidance training, 415, 416 perceptual learning in, 315–316 249–252, 364, 459
in blocking and unblocking preexposure in, 211–212, Tony (dog), 10–11
experiments, 103, 104, 105, 118 315–316 touchscreens
and extinction, 167 preparedness in, 70–72 autoshaping of pecking on, 88
in fear conditioning, 86–87, 161, priming in, 140 in categorization experiments,
162, 415, 416 reinforcer devaluation effect in, 302, 304
and inhibition of delay, 339 446–447, 449–450 in delayed matching-to-sample
interval between CS and US revaluation of US in, 215–216 experiments, 356, 358
affecting, 91 safety in, 211–213 visual perception by pigeons of
in LN–shock trials, 104, 105, short-term memory in, 212–213 images on, 323
124, 129 stimulus relevance in, 210 trace conditioning, 91, 92, 146
renewal of, 167 taste-reactivity test in, 213–214 trace decay of memory, 163–164
suppression index, 162 in thiamine deficiency, 375 transfer tests, 300, 301, 453
suppression ratio timing of outcome in, 67–68 transposition in generalization
in avoidance training, 416 taste-reactivity test, 213–214 gradients, 312
in blocking experiment, 104 triadic design, 431
Trichogaster trichopterus, territorial- memory of, 25, 137–138, video games, 228
ity and reproduction of, 56–58 160–161 visual perception
two-factor theory of avoidance nervous system response to, after-images in, 397
learning, 412–420 191–192 geons in, 321, 322
U novelty of, 93–94 McCollough effect in, 189
unblocking, 105, 114, 118, 129 in occasion setting, 183, 185 in pigeons, 295, 321–325
uncertain response, 358 overexpectation of, 123, 124 voluntary behavior in operant
unconditional response (UR), 80, in panic disorder, 198 experiments, 18–19, 253
81, 82, 187, 190–193 preexposure to, 94 W
compensatory, 191 priming of, 138, 140 Wagner, Allan, 106, 114, 137, 150,
in drug conditioning in pseudoconditioning, 95–97 151
experiment, 193 in relative validity experiments, AESOP model of, 148
nervous system in, 191–192 107 Rescorla-Wagner model of,
in panic disorder, 198, 199 in retardation-of-acquisition 114–130
unconditional stimulus (US), 80, test, 99, 100 SOP model of, 144–148
81, 82, 101–105 revaluation of, 215–216 warning signals in avoidance be-
activation of nodes, 144–148, in second-order conditioning, havior, 413, 415–420, 428, 429
168, 190 83, 84 Wasserman, Ed, 298
AESOP model on, 148–149 sensitization to, 95–97 water
in appetitive conditioning, sensory and emotional nodes, deprivation of, 279, 280,
89–90 148–149 369–370
in autoshaping of pecking in sexual behavior system, 197 need for, 371–372
behavior, 88–89 SOP model on, 144–148, 190 as reinforcer, 279
in blocking and unblocking surprise of, 114–117, 121, 129, water mazes, 249, 347–349,
experiments, 104–105, 133, 134, 138 350–351
117–118 in taste aversion learning, 90, Watson, John B., 15–16, 86
and conditional response, 190 206–207, 214–216 whelks, crows feeding on, 29–30,
and conditional stimulus time of presentation and 34, 51, 54
association, 115, 160, 452 strength of conditioning, Whitehead, Alfred North, 457
and conditional stimulus 91–93, 147 Whitlow, J. W., 141
contingencies, 101–103, unitization, perceptual learning in, withdrawal responses, 63–64, 404
125–127 316–317 within-compound association,
and conditional stimulus UR. See unconditional response 218–219
interval, 197 US. See unconditional stimulus Wolfsohn, Stefan, 14, 80
and conditional stimulus V working memory
pairings, 102–105, 115, 116 validity, relative, 106–109, 222–225, in impulsiveness and self-
conditional stimulus predicting, 230 control, 270
102–103, 114–115, 116, 117, value of rewards, 263, 267, 446–449 in information processing, 320,
124–125, 130, 131 in delay discounting, 267–268 326–332
context of, 125–126 expected, 379 in time cognition, 340, 341, 342
drugs as, 188–189, 193 in reinforcer devaluation effect, wrens, expert discrimination of, 313
in explicitly unpaired 446–449, 452, 457–459 Y
procedure, 98 variability in behavior, 363, 364 “yuck” response, 214, 449
in eyeblink conditioning, 85, variable interval schedule of rein- “yum” response, 214
138, 148–149 forcement, 258, 259
in fear conditioning, 83, 86–87 variable ratio schedule of rein- Z
in inhibition of delay, 98 forcement, 258 Zach, Reto, 29–30, 51, 54
inhibition of node activation, 168 variables, intervening, 20–21, 22 zebra finches, beak color prefer-
intensity of, 94, 119 verbal instructions, and electroder- ences of, 311
magnitude of, 94, 116, 119, 120, mal response, 215, 239 zeitgebers, 336
215–216 verbal interference, 172t Zentall, Thomas, 318
verbal rewards, 383 zero contingency, 125, 126, 127, 432
About the Book
Editor: Sydney Carroll
Production Manager: Christopher Small
Photo Researcher: David McIntyre
Book Design: Joanne Delphia and Beth Roberge Friedrichs
Cover Design: Beth Roberge Friedrichs
Book Layout: Beth Roberge Friedrichs
Illustration Program: Jan Troutt/Troutt Visual Services, LLC.
Cover and Book Manufacturer: R. R. Donnelley
