
Unit 4 - Learning

The document discusses various types of learning, including classical conditioning, operant conditioning, and cognitive learning. It explains how behaviors can be learned through associations between stimuli and responses, as well as the roles of reinforcement and punishment in shaping behavior. Key figures such as Ivan Pavlov and B.F. Skinner are highlighted for their contributions to the understanding of these learning processes.

Uploaded by

nirmit12344

Unit 4- LEARNING

Types of Learning
 Classical conditioning: learning to link two stimuli in a way that helps us anticipate an event to which we have a reaction
 Operant conditioning: changing behavior choices in response to consequences
 Cognitive learning: acquiring new behaviors and information through observation and information, rather than by direct experience
Associative Learning:
Classical Conditioning
How it works: after repeated exposure to two stimuli occurring in sequence, we associate those stimuli with each other.
Result: our natural response to one stimulus now can be triggered by the new, predictive stimulus.
Stimulus 1: See lightning
Stimulus 2: Hear thunder
Here, our response to thunder becomes associated with lightning.
Associative Learning:
Operant Conditioning
 Child associates his “response” (behavior) with consequences.
 Child learns to repeat behaviors (saying “please”) which were
followed by desirable results (cookie).
 Child learns to avoid behaviors (yelling “gimme!”) which were
followed by undesirable results (scolding or loss of dessert).
Cognitive Learning
Cognitive learning refers to acquiring new behaviors and information mentally by observing events and the behavior of others, rather than by direct experience.
Behaviorism
 The term behaviorism was used by John B. Watson (1878-1958), a proponent of classical conditioning, as well as by B.F. Skinner (1904-1990), a leader in research about operant conditioning.
 Both scientists believed that mental life was much less important than behavior as a foundation for psychological science.
Ivan Pavlov’s Discovery
While studying salivation in dogs, Ivan Pavlov found that salivation from eating food was eventually triggered by what should have been neutral stimuli such as:
 just seeing the food.
 seeing the dish.
 seeing the person who brought the food.
 just hearing that person’s footsteps.
Before Conditioning
Neutral stimulus:
a stimulus which does not trigger a response

Neutral stimulus (NS) → No response
Before Conditioning
Unconditioned stimulus and response:
a stimulus which triggers a response naturally,
before/without any conditioning
Unconditioned stimulus (US): yummy dog food → Unconditioned response (UR): dog salivates
During Conditioning
The bell/tone (NS) is repeatedly presented with the food (US).
Neutral stimulus (NS) + Unconditioned stimulus (US) → Unconditioned response (UR): dog salivates
After Conditioning
The dog begins to salivate upon hearing the tone (neutral stimulus becomes conditioned stimulus).
Conditioned (formerly neutral) stimulus → Conditioned response: dog salivates
Did you follow the changes?
 The UR and the CR are the same response, triggered by different events. The difference is whether conditioning was necessary for the response to happen.
 The NS and the CS are the same stimulus. The difference is whether the stimulus triggers the conditioned response.
Find the US, UR, NS, CS, CR in the following:
 Your romantic partner always uses the same shampoo. Soon, the smell of that shampoo makes you feel happy.
 The nurse says, “This won’t hurt a bit,” just before stabbing you with a needle. The next time you hear “This won’t hurt,” you cringe in fear.
Higher-Order Conditioning
 If the dog becomes conditioned to salivate at the
sound of a bell, can the dog be conditioned to
salivate when a light flashes…by associating it
with the BELL instead of with food?
 Yes! The conditioned response can be
transferred from the US to a CS, then from there
to another CS.

This is higher-order conditioning: turning a NS into a CS by associating it with another CS.
A man who was conditioned to associate joy with coffee could then learn to associate joy with a restaurant if he was served coffee there every time he walked into the restaurant.
Acquisition
Acquisition refers to the initial stage of learning/conditioning.
What gets “acquired”?
 The association between a neutral stimulus (NS) and an unconditioned stimulus (US).
How can we tell that an acquisition has occurred?
 The response that used to be the UR now gets triggered by a CS, as a conditioned response (drooling now gets triggered by a bell).

Timing
For the association to be acquired,
the neutral stimulus (NS) needs to
repeatedly appear before the
unconditioned stimulus (US)…about a
half-second before, in most cases. The
bell must come right before the food.
Acquisition and Extinction
 The strength of a CR grows with conditioning.
 Extinction refers to the diminishing of a conditioned
response. If the US (food) stops appearing with the CS
(bell), the CR decreases.
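The growth and decline of the CR described above can be illustrated with a small simulation. This is only a sketch, not something from the slides: it assumes a simplified Rescorla-Wagner-style update, in which associative strength moves a fixed fraction of the way toward 1 when the CS is paired with the US and toward 0 when the CS appears alone. The function name, the learning rate, and the trial counts are all hypothetical choices.

```python
def simulate_conditioning(trials, learning_rate=0.3, strength=0.0):
    """Return the CS-US associative strength after each trial.

    trials: sequence of booleans; True means the CS (bell) was paired
    with the US (food), False means the CS was presented alone.
    Assumption: a simplified Rescorla-Wagner-style update rule.
    """
    history = []
    for paired in trials:
        target = 1.0 if paired else 0.0  # US present pulls strength toward 1
        strength += learning_rate * (target - strength)
        history.append(strength)
    return history


# 10 acquisition trials (bell + food), then 10 extinction trials (bell alone).
curve = simulate_conditioning([True] * 10 + [False] * 10)
acquisition, extinction = curve[:10], curve[10:]
print("after acquisition:", round(acquisition[-1], 3))
print("after extinction: ", round(extinction[-1], 3))
```

Under these assumptions the strength rises on every paired trial and falls on every bell-alone trial. The sketch does not model spontaneous recovery, which this simple update rule cannot produce.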
Spontaneous Recovery
[Return of the CR]
 After a CR (salivation) has been conditioned and then extinguished, following a rest period, presenting the tone alone might lead to a spontaneous recovery (a return of the conditioned response despite a lack of further conditioning).
 If the CS (tone) is again presented repeatedly without the US, the CR becomes extinct again.
Generalization and Discrimination
Generalization refers to the tendency to have conditioned responses triggered by related stimuli.
Discrimination refers to the learned ability to respond only to a specific stimulus, preventing generalization.
Ivan Pavlov’s Legacy
John B. Watson and Classical Conditioning: Playing with Fear
 In 1920, 9-month-old Little Albert was not afraid of rats.
 John B. Watson and Rosalie Rayner then clanged a steel bar every time a rat was presented to Albert.
 Albert acquired a fear of rats, and generalized this fear to other soft and furry things.
 Watson prided himself on his ability to shape people’s emotions. He later went into advertising.
Little Albert Experiment
Before Conditioning
 NS: rat → No fear
 UCS: steel bar hit with hammer → Natural reflex: fear
During Conditioning
 NS: rat + UCS: steel bar hit with hammer → Natural reflex: fear
After Conditioning
 NS: rat → Conditioned reflex: fear
Operant Conditioning

How it works:
An act of chosen behavior (a “response”) is followed
by a reward or punitive feedback from the
environment.
Results:
Reinforced behavior is more likely to be tried again.
Punished behavior is less likely to be chosen in the
future.

Response: balancing a ball → Consequence: receiving food → Behavior strengthened
Operant and Classical Conditioning are Different Forms of Associative Learning
Classical conditioning:
 involves respondent behavior, reflexive, automatic reactions such as fear or craving
 these reactions to unconditioned stimuli (US) become associated with neutral (then conditioned) stimuli
Operant conditioning:
 involves operant behavior, chosen behaviors which “operate” on the environment
 these behaviors become associated with consequences which punish (decrease) or reinforce (increase) the operant behavior
There is a contrast in the process of conditioning.
 Classical conditioning: The experimental (neutral) stimulus repeatedly precedes the respondent behavior, and eventually triggers that behavior.
 Operant conditioning: The experimental (consequence) stimulus repeatedly follows the operant behavior, and eventually punishes or reinforces that behavior.
B.F. Skinner: Behavioral Control
B. F. Skinner saw potential for exploring and using Edward Thorndike’s principles much more broadly. He wondered:
 how can we more carefully measure the effect of consequences on chosen behavior?
 what else can creatures be taught to do by controlling consequences?
 what happens when we change the timing of reinforcement?
B.F. Skinner trained pigeons to play ping pong, and guide a video game missile.
B.F. Skinner: The Operant Chamber
 B. F. Skinner, like Ivan Pavlov, pioneered more controlled methods of studying conditioning.
 The operant chamber, often called “the Skinner box,” allowed detailed tracking of rates of behavior change in response to different rates of reinforcement.
Parts of the operant chamber:
 Recording device
 Bar or lever that an animal presses, randomly at first, later for reward
 Food/water dispenser to provide the reward
Reinforcement
 Reinforcement refers to any feedback from the environment that makes a behavior more likely to recur.
 Positive reinforcement: adding something desirable (e.g., warmth)
 Negative reinforcement: ending/removing something unpleasant (e.g., the cold)
This meerkat has just completed a task out in the cold. For the meerkat, this warm light is desirable.
A cycle of mutual reinforcement
Children who have a temper tantrum when they are frustrated may get positively reinforced for this behavior when parents occasionally respond by giving in to a child’s demands.
Result: stronger, more frequent tantrums
Parents who occasionally give in to tantrums may get negatively reinforced when the child responds by ending the tantrum.
Result: parents’ giving-in behavior is strengthened (giving in sooner and more often)
Discrimination
 Discrimination refers to the ability to become more and more specific in what situations trigger a response.
 Shaping can increase discrimination, if reinforcement only comes for certain discriminative stimuli.
 For example, dogs, rats, and even spiders can be trained to search for very specific smells, from drugs to explosives.
 Pigeons, seals, and manatees have been trained to respond to specific shapes, colors, and categories.
Pictured: a manatee that selects shapes.
Shaping
 Imagine that you wanted to condition a hungry rat to press a
bar. Like Skinner, you could tease out this action with
shaping, gradually guiding the rat’s actions toward the
desired behavior.
 First, you would watch how the animal naturally behaves, so that you
could build on its existing behaviors.

 You might give the rat a bit of food each time it approaches the bar.

 Once the rat is approaching regularly, you would give the food only when
it moves close to the bar, then closer still.

 Finally, you would require it to touch the bar to get food. With this method
of successive approximations, you reward responses that are ever closer
to the final desired behavior, and you ignore all other responses.
 By making rewards contingent on desired behaviors,
researchers and animal trainers gradually shape
complex behaviors.
How often should we
reinforce?
 Do we need to give a reward every single time? Or is
that even best?
 B.F. Skinner experimented with the effects of giving
reinforcements in different patterns or “schedules” to
determine what worked best to establish and maintain
a target behavior.
 In continuous reinforcement (giving a reward after
the target behavior every single time), the subject acquires the
desired behavior quickly.
 In partial/intermittent reinforcement (giving
rewards part of the time), the target behavior takes
longer to be acquired/established but persists longer
without reward.
Different Schedules of Reinforcement

We may schedule our reinforcements based on an interval of time that has gone by.
 Fixed interval schedule: reward every hour
 Variable interval schedule: reward after a changing/random amount of time passes
Different Schedules of Reinforcement

We may plan for a certain ratio of rewards per number of instances of the desired behavior.
 Fixed ratio schedule: reward every five targeted behaviors
 Variable ratio schedule: reward after a changing/random number of instances of the target behavior
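The four schedules (fixed and variable interval, fixed and variable ratio) can be sketched as small decision rules that say whether a given response earns a reward. This is an illustrative sketch only: the function names, the one-response-per-time-unit setup, and the way "variable" is randomized (a fixed per-response probability for the ratio case, a uniformly drawn wait for the interval case) are assumptions, not details from the slides.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (e.g., n=5: every fifth lever press)."""
    count = 0

    def respond(time=None):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n.

    Assumption: each response pays off with probability 1 / mean_n.
    """
    def respond(time=None):
        return rng.random() < 1.0 / mean_n
    return respond


def fixed_interval(interval):
    """Reinforce the first response made once `interval` time units have passed."""
    last_reward = 0.0

    def respond(time):
        nonlocal last_reward
        if time - last_reward >= interval:
            last_reward = time
            return True
        return False
    return respond


def variable_interval(mean_interval, rng):
    """Like fixed_interval, but the required wait is redrawn after each reward."""
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_interval)

    def respond(time):
        nonlocal last_reward, wait
        if time - last_reward >= wait:
            last_reward = time
            wait = rng.uniform(0, 2 * mean_interval)
            return True
        return False
    return respond


# One response per time unit for 60 time units.
rng = random.Random(0)  # seeded so the sketch is repeatable
schedules = {
    "fixed ratio (every 5th)": fixed_ratio(5),
    "variable ratio (avg 5)": variable_ratio(5, rng),
    "fixed interval (10)": fixed_interval(10),
    "variable interval (avg 10)": variable_interval(10, rng),
}
rewards = {}
for name, respond in schedules.items():
    total = 0
    for t in range(1, 61):
        if respond(t):
            total += 1
    rewards[name] = total
    print(f"{name}: {total} rewards in 60 responses")
```

The key contrast is what each rule counts: ratio schedules count responses, so responding faster earns rewards sooner, while interval schedules count elapsed time, so extra responses before the interval ends earn nothing.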
Which Schedule of
Reinforcement is This? Ratio or
Interval? Fixed or Variable?
1. Rat gets food every third time it presses the lever
2. Getting paid weekly no matter how much work is done
3. Getting paid for every ten boxes you make
4. Hitting a jackpot sometimes on the slot machine
5. Winning sometimes on the lottery you play once a day
6. Checking cell phone all day; sometimes getting a text
7. Buy eight pizzas, get the next one free
8. Fundraiser averages one donation for every eight
houses visited
9. Kid has tantrum, parents sometimes give in
10. Repeatedly checking mail until paycheck arrives
Results of the different schedules of reinforcement
 Fixed interval: slow, unsustained responding, with rapid responding near the time for reinforcement
E.g.- If I’m only paid for the work done on Saturday, I’m not going to work as hard on the other days.
 Variable interval: slow, consistent (steady) responding
E.g.- If I never know which day my lucky lottery number will pay off, I better play it every day.
Effectiveness of the ratio schedules of Reinforcement
 Fixed ratio: high rate of responding
E.g.- Buy two drinks, get one free? I’ll buy a lot of them!
 Variable ratio: high, consistent responding, even if reinforcement stops (resists extinction)
E.g.- If the slot machine sometimes pays, I’ll pull the lever as many times as possible because it may pay this time!
Operant Effect: Punishment
Punishments have the opposite effects of reinforcement. These consequences make the target behavior less likely to occur in the future.
 Positive (+) Punishment: You ADD something unpleasant/aversive (ex: spank the child)
 Negative (-) Punishment: You TAKE AWAY something pleasant/desired (ex: no TV / Mobile)
Positive does not mean “good” or “desirable” and negative does not mean “bad” or “undesirable.”
When is punishment
effective?
 Punishment works best in natural
settings when we encounter
punishing consequences from
actions such as reaching into a fire;
in that case, operant conditioning
helps us to avoid dangers.
 Punishment is effective when we try
to artificially create punishing
consequences for others’ choices;
these work best when consequences
happen as they do in nature.
Severity of punishments is not
as helpful as making the
punishments immediate and
certain.
Applying operant conditioning to
parenting
Problems with Physical Punishment
 Punished behaviors may restart when the punishment is over; learning is not lasting.
 Instead of learning behaviors, the child may learn to discriminate among situations, and avoid those in which punishment might occur.
 Instead of behaviors, the child might learn an attitude of fear or hatred, which can interfere with learning. This can generalize to a fear/hatred of all adults or many settings.
 Physical punishment models aggression and control as a method of dealing with
Don’t think about the beach
Don’t think about the waves, the sand, the towels and sunscreen, the sailboats and surfboards. Don’t think about the beach.
Are you obeying the instruction? Would you obey this instruction more if you were punished for thinking about the beach?
The Power of Rephrasing
 Positive punishment: “You’re
playing video games instead of
practicing the piano, so I am
justified in YELLING at you.”
 Negative punishment:
“You’re avoiding practicing, so
I’m turning off your game.”

 Negative reinforcement: “I
will stop staring at you and
bugging you as soon as I see
that you are practicing.”
 Positive reinforcement:
“After you practice, we’ll play a
game!”
Summary: Types of Consequences
Outcome: Strengthens target behavior (You do chores)
 Adding stimuli: Positive (+) Reinforcement (Add Something Desirable) (You get candy)
 Subtracting stimuli: Negative (-) Reinforcement (Remove Something Undesirable) (I stop yelling)
Outcome: Reduces target behavior (cursing)
 Adding stimuli: Positive (+) Punishment (Add Something Undesirable) (You get spanked)
 Subtracting stimuli: Negative (-) Punishment (Remove Something Desirable) (No cell phone)
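The summary above amounts to a two-by-two lookup: whether a stimulus is added or removed, and whether that stimulus is desirable or undesirable. A minimal sketch of that lookup follows; the function name and the string labels are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change, stimulus_quality):
    """Name the operant consequence and its effect on the target behavior.

    stimulus_change: "add" or "remove"
    stimulus_quality: "desirable" or "undesirable"
    Returns a (label, effect) tuple.
    """
    table = {
        ("add", "desirable"): ("positive reinforcement", "strengthens"),
        ("remove", "undesirable"): ("negative reinforcement", "strengthens"),
        ("add", "undesirable"): ("positive punishment", "reduces"),
        ("remove", "desirable"): ("negative punishment", "reduces"),
    }
    return table[(stimulus_change, stimulus_quality)]


# Examples taken from the summary.
print(classify_consequence("add", "desirable"))       # you get candy
print(classify_consequence("remove", "undesirable"))  # I stop yelling
print(classify_consequence("add", "undesirable"))     # you get spanked
print(classify_consequence("remove", "desirable"))    # no cell phone
```

Note that "positive" and "negative" track only the add/remove axis; whether the behavior strengthens or weakens depends on the combination.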
More Operant Conditioning
Applications

Parenting
1. Rewarding small improvements toward desired behaviors works better than expecting complete success, and also works better than punishing problem behaviors.
2. Giving in to temper tantrums stops them in the short run but increases them in the long run.

Self-Improvement
Reward yourself for steps you take toward your goals. As you establish good habits, make your rewards more infrequent (intermittent).
Role of Biology in
Conditioning
Classical Conditioning
 John Garcia and others found it was easier to learn associations that make sense for survival.
 Food aversions can be acquired even if the UR (nausea) does NOT immediately follow the NS. When acquiring food aversions during pregnancy or illness, the body associates nausea with whatever food was eaten.
 Males in one study were more likely to see a pictured woman as attractive if the picture had a red border.
Cognitive Processes
In classical conditioning
 When the dog salivates at the bell, it may be due to cognition (learning to predict, even expect, the food).
 Conditioned responses can alter attitudes, even when we know the change is caused by conditioning.
 However, knowing that our reactions are caused by conditioning gives us the option of mentally breaking the association, e.g. deciding that nausea associated with a food aversion was actually caused by an illness.
 Higher-order conditioning involves some cognition; the name of a food may trigger salivation.
In operant conditioning
 In fixed-interval reinforcement, animals do more target behaviors/responses around the time that the reward is more likely, as if expecting the reward.
 Expectation as a cognitive skill is even more evident in the ability of humans to respond to delayed reinforcers such as a paycheck.
 Higher-order conditioning can be enabled with cognition; e.g., seeing something such as money as a reward because of its indirect value.
 Humans can set behavioral goals for self and others, and plan their own reinforcers.
Learning, Rewards, and Motivation
 Intrinsic motivation refers to the desire to perform a behavior well for its own sake. The reward is internalized as a feeling of satisfaction.
 Extrinsic motivation refers to doing a behavior to receive rewards from others.
 Intrinsic motivation can sometimes be reduced by external rewards, and can be prevented by using continuous reinforcement.
 One principle for maintaining behavior is to use as few rewards as possible, and fade the rewards over time.
What might happen if we begin to reward a behavior someone was already doing and enjoying?
Learning by Observation
 We can learn new behaviors and skills without conditioning and reward through observational learning: watching what happens when other people do a behavior and learning from their experience.
 Skills required:
 mirroring, being able to picture ourselves doing the same action, and
 cognition, noticing consequences and associations.
Learning by Observation
Observational Learning Processes
 Modeling: The behavior of others serves as a model, an example of how to respond to a situation; we may try this model regardless of reinforcement.
 Vicarious Conditioning: Vicarious means experienced indirectly, through others. Vicarious reinforcement and punishment means our choices are affected as we see others get consequences for their behaviors.
Albert Bandura’s Bobo Doll
Experiment (1961)
 Kids saw adults punching an inflated doll while narrating their
aggressive behaviors such as “kick him.”
 These kids were then put in a toy-deprived situation… and acted
out the same behaviors they had seen.
Mirroring in the Brain
 When we watch others doing or feeling
something, neurons fire in patterns that
would fire if we were doing the action or
having the feeling ourselves.
 These neurons are referred to as mirror
neurons, and they fire only to reflect the
actions or feelings of others.
From Mirroring to Imitation
 Humans are prone to spontaneous imitation of both behaviors and emotions (“emotional contagion”).
 This includes even overimitating, that is, copying adult behaviors that have no function and no reward.
 Children with autism are less likely to cognitively “mirror,” and less likely to follow someone else’s gaze as a neurotypical toddler does.
Mirroring Plus Vicarious
Reinforcement
 Mirroring enables observational learning; we
cognitively practice a behavior just by watching it.
 If you combine this with vicarious reinforcement,
we are even more likely to get imitation.
 Monkey A saw Monkey B getting a banana after
pressing four symbols. Monkey A then pressed the
same four symbols (even though the symbols were
in different locations).
Prosocial Effects of Observational Learning
 Prosocial behavior refers to actions which benefit others, contribute value to groups, and follow moral codes and social norms.
 Parents try to teach this behavior through lectures, but it may be taught best through modeling, especially if kids can see the benefits of the behavior to oneself or others.
Antisocial Effects of Observational Learning
 What happens when we learn from models who demonstrate antisocial behavior, actions that are harmful to individuals and society?
 Children who witness violence in their homes, but are not physically harmed themselves, may hate violence but still may become violent more often than the average child.
 Perhaps this is a result of “the Bobo doll effect”? Under stress, we do what has been modeled for us.
Media Models of
Violence
Do we learn antisocial behavior such as violence from indirect observations of others in the media?
Research shows that viewing media violence leads to increased aggression (fights) and reduced prosocial behavior (such as helping an injured person).
This violence-viewing effect might be explained by imitation, and also by desensitization toward pain in others.
