Operant Conditioning Theory
by B.F. Skinner
Prof. B.F. Skinner (b. 1904) started his research work on
behavior while he was a graduate student in the Department of
Psychology at Harvard University. In 1931 he wrote his
thesis, entitled “The Concept of the Reflex in the Description
of Behavior”. Skinner was an experimental psychologist who
conducted several experiments on rats and pigeons.
His important publications are: ‘The Behavior of
Organisms’ (1938), ‘Science and Human Behavior’ (1953),
‘Verbal Behavior’ (1957), ‘Cumulative Record’ (1959),
‘Beyond Freedom and Dignity’ (1971) and ‘About
Behaviorism’ (1974).
Introduction of Operant Conditioning
According to Skinner, there are two types of behaviors,
namely respondent behavior and operant behavior.
You blink your eye in response to a flash of light. This
reflexive behavior is elicited directly by the environment. So
this is respondent behavior: an automatic response to stimuli.
But most of our behaviors are not so simply generated by the
environment. You are not forced by the environment to look at
a book, to talk, to sing, or to eat. These behaviors are emitted
by you, the individual. Through such behaviors, you operate
upon the environment. These are called operant behaviors.
The Operant Experiment
Skinner designed a box, now known as the ‘Skinner box’, and placed a
hungry rat inside.
The box had a lever which, when pressed, triggered a
mechanism that delivered a pellet of food to the rat.
Initially, the rat engaged in a number of random behaviors
like walking, sniffing and scratching. None of these helped it to
get the food.
At some point, the rat accidentally hit the lever and the
food was delivered. Of course, for the semi-starved rat, this
was a big reward.
Skinner observed that after a few accidental manipulations
of the lever, the rat started spending more time near the
lever, and then deliberately pressed the lever whenever it
was hungry.
So, now pressing the lever became a new operant for the rat.
Skinner further noted that if the pressing of the lever did not
deliver food any longer, the operant behavior by the rat
decreased and gradually stopped altogether.
This is known as experimental extinction of operant
conditioning.
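The acquisition and extinction pattern described above can be illustrated with a toy model. The sketch below is not Skinner's own formulation: it assumes a simple linear update rule in which the probability of pressing the lever moves toward 1 after each reinforced press and decays toward 0 after each unreinforced press. The function names and the parameters `alpha` and `beta` are hypothetical, chosen only for illustration.

```python
def update_strength(p, reinforced, alpha=0.3, beta=0.2):
    """Return the new probability of lever pressing after one press.

    Assumed rule (illustrative only): a reinforced press moves p toward 1
    by a fraction alpha; an unreinforced press shrinks p by a fraction beta.
    """
    if reinforced:
        return p + alpha * (1.0 - p)
    return p * (1.0 - beta)


def run_phases(p0, n_acquisition, n_extinction):
    """Simulate an acquisition phase followed by an extinction phase."""
    history = [p0]
    p = p0
    for _ in range(n_acquisition):   # food is delivered after each press
        p = update_strength(p, reinforced=True)
        history.append(p)
    for _ in range(n_extinction):    # the lever no longer delivers food
        p = update_strength(p, reinforced=False)
        history.append(p)
    return history


# Hypothetical session: 10 reinforced presses, then 15 unreinforced presses.
history = run_phases(p0=0.05, n_acquisition=10, n_extinction=15)
peak = history[10]    # strength at the end of acquisition
final = history[-1]   # strength at the end of extinction
```

In this sketch the response strength rises while food is delivered and falls steadily once it is withheld, which mirrors the gradual decline Skinner observed.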
For experiments with pigeons, Skinner made
use of another specific apparatus called the “Pigeon
Box”.
A pigeon in this experiment had to peck at a lighted
plastic key mounted on the wall at head height and
was consequently rewarded by receiving grain.
Measuring Operant Behavior
Quantification of operant behavior was crucial to Skinner’s
work. He needed to demonstrate that through appropriate use
of reward and punishment you can actually increase the
probability of occurrence of a conditioned operant behavior.
Therefore Skinner introduced the rate of occurrence of the
target behavior as the measure of operant conditioning.
He simply counted how many times the learnt behavior
took place within a given time. In fact, he used the cumulative
frequency of the operant behavior as the final indicator.
If you put it in a graphical form you will readily see whether
the probability of the occurrence of that behavior has actually
increased over time.
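The two measures described above, rate of occurrence and cumulative frequency, can be computed as in the following minimal sketch. The lever-press timestamps, the session length and the bin width are made-up values for illustration, and the function names are hypothetical.

```python
from itertools import accumulate


def cumulative_record(response_times, session_length, bin_width):
    """Count responses per time bin and accumulate them over the session."""
    n_bins = int(session_length // bin_width)
    counts = [0] * n_bins
    for t in response_times:
        idx = int(t // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    cumulative = list(accumulate(counts))
    return counts, cumulative


def response_rate(response_times, session_length):
    """Responses per unit of time over the whole session."""
    return len(response_times) / session_length


# Hypothetical lever-press times in seconds during a 120-second session.
presses = [5, 40, 62, 75, 84, 91, 97, 102, 106, 110, 113, 116, 119]

counts, cumulative = cumulative_record(presses, session_length=120, bin_width=30)
print(counts)      # [1, 1, 3, 8]
print(cumulative)  # [1, 2, 5, 13]
print(response_rate(presses, 120))  # presses per second
```

A cumulative curve that grows steeper from one bin to the next, as it does here, indicates that the behavior is occurring more often as the session goes on.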
Operations in Operant Conditioning
1. Shaping
2. Extinction
3. Spontaneous Recovery
4. Concept of Reinforcement
1. Shaping
Shaping is an extremely important concept in operant
conditioning.
Shaping means modification of the organism’s behavior to the
experimenter’s desired end.
It takes place only through ‘successive approximations’.
Suppose you are trying to modify a child’s behavior by
selectively rewarding the response you desire. Before the
ultimate desired behavior is enacted, the child is usually
engaged in numerous other behaviors which may be considered
as steps to the final behavior. They are close to the target, but
not the target per se. If these approximate target behaviors are
rewarded, shaping is facilitated.
Skinner discovered this principle of successive
approximation rather accidentally.
He was conditioning a pigeon to swipe a ball with its beak,
a movement which in turn would operate a food magazine.
The pigeon had no luck. After waiting a long time for the
accidental success to happen, Skinner grew
bored. So, just casually, he decided to reward any behavior
that might lead toward the target behavior.
As these approximate behaviors were successively rewarded,
to Skinner’s surprise, the total process was quickened.
Very soon ‘the ball was caroming off the walls of the box as
if the pigeon had been a champion squash player’ (Skinner,
1938, p. 38). Rewarding each simpler step automatically
led to the next higher step, and so on. This is successive
approximation.
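The idea of successive approximation can be sketched as a loop in which any response that meets the current criterion is rewarded and the criterion is then tightened. This is a toy simulation under stated assumptions, not Skinner's procedure: behavior is reduced to a single number (a hypothetical "distance" from the target behavior), responses vary randomly around the animal's typical behavior, and a rewarded response becomes the new typical behavior. All names and parameter values are illustrative.

```python
import random


def shape(start_distance=10.0, goal=0.5, noise=2.0, trials=200, seed=0):
    """Toy shaping loop: reward approximations, then tighten the criterion."""
    rng = random.Random(seed)
    typical = start_distance     # how far typical behavior is from the target
    criterion = start_distance   # responses at or below this are rewarded
    criteria = [criterion]
    rewards = 0
    for _ in range(trials):
        # Each response varies randomly around the current typical behavior.
        observed = max(0.0, typical + rng.uniform(-noise, noise))
        if observed <= criterion:
            rewards += 1
            typical = observed                # assumed: rewarded form is repeated
            criterion = max(goal, observed)   # demand a closer approximation next
            criteria.append(criterion)
        if typical <= goal:
            break                             # target behavior reached
    return typical, criteria, rewards


final_distance, criteria, rewards = shape()
```

The list `criteria` records how the requirement for a reward moves step by step toward the target, which is the sense in which each rewarded step leads to the next.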
Principles Involved in Shaping
1. Generalization: using the experience and knowledge of one
situation in another situation. It takes two forms: response
generalization and stimulus generalization.
2. Habit Competition: at each point of the chain, the correct habit
must attain dominance over competing habits. This is accomplished
by reinforcing the correct habit alone.
3. Chaining: cues produced by one response must be linked with the
succeeding response. For example, if we want to train the pigeon to
move in a circle, then we must reinforce every correct turn.
Schedules of Reinforcement
1. Continuous Reinforcement
In continuous reinforcement, the desired behavior
is reinforced every single time it occurs.
This schedule is best used during the initial stages
of learning in order to create a strong association
between the behavior and the response.
Once the response is firmly established,
reinforcement is usually switched to a partial
reinforcement schedule.
2. Partial Reinforcement
In partial reinforcement, the response is reinforced
only part of the time.
Learned behaviors are acquired more slowly with
partial reinforcement, but the response is more
resistant to extinction.
A) Fixed-ratio schedules are those where a response is
reinforced only after a specified number of responses. This
schedule produces a high, steady rate of responding with only a
brief pause after the delivery of the reinforcer. An example of a
fixed-ratio schedule would be delivering a food pellet to a rat
after it presses a bar five times.
B) Variable-ratio schedules occur when a response is
reinforced after an unpredictable number of responses. This
schedule creates a high, steady rate of responding. Gambling
and lottery games are good examples of a reward based on a
variable ratio schedule. In a lab setting, this might involve
delivering food pellets to a rat after one bar press, again after
four bar presses, and the third pellet after two bar presses.
C) Fixed-interval schedules are those where the first
response is rewarded only after a specified amount of time
has elapsed. This schedule causes high amounts of responding
near the end of the interval but much slower responding
immediately after the delivery of the reinforcer.
D) Variable-interval schedules occur when a response is
rewarded after an unpredictable amount of time has passed.
This schedule produces a slow, steady rate of response. An
example of this would be delivering a food pellet to a rat after
the first bar press following a one-minute interval, another
pellet for the first response following a five-minute interval,
and a third food pellet for the first response following a three-
minute interval.
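The schedules above differ only in the rule that decides which response earns a reinforcer, and that rule can be written down directly. The sketch below is a minimal illustration, not a standard library or lab protocol: the function names are hypothetical, the variable schedules are given as explicit lists of requirements (taken from the examples in the text) instead of random draws, and each interval is assumed to be timed from the previous reinforcer or from the start of the session.

```python
def ratio_schedule(requirements, n_presses):
    """Return the press numbers (1-based) that are reinforced.

    requirements lists how many presses each successive reinforcer needs;
    the list is reused cyclically. [5] is fixed-ratio 5, [1] is continuous
    reinforcement, and a mixed list such as [1, 4, 2] is a variable ratio.
    """
    reinforced = []
    needed_index = 0
    count = 0
    for press in range(1, n_presses + 1):
        count += 1
        if count == requirements[needed_index % len(requirements)]:
            reinforced.append(press)
            count = 0
            needed_index += 1
    return reinforced


def interval_schedule(intervals, press_times):
    """Return the press times (seconds) that are reinforced.

    The first press made once the current interval has elapsed since the
    last reinforcer (assumed timing rule) is reinforced.
    """
    reinforced = []
    interval_index = 0
    last = 0.0
    for t in sorted(press_times):
        if t - last >= intervals[interval_index % len(intervals)]:
            reinforced.append(t)
            last = t
            interval_index += 1
    return reinforced


# Continuous reinforcement: every press is reinforced.
print(ratio_schedule([1], 3))               # [1, 2, 3]
# Fixed ratio: a pellet after every fifth bar press.
print(ratio_schedule([5], 15))              # [5, 10, 15]
# Variable ratio: after one press, then four more, then two more.
print(ratio_schedule([1, 4, 2], 7))         # [1, 5, 7]
# Fixed interval of 60 s, with hypothetical press times.
print(interval_schedule([60], [10, 50, 61, 70, 119, 125, 200]))  # [61, 125, 200]
# Variable interval of 1, 5 and 3 minutes, with a press every 30 s.
print(interval_schedule([60, 300, 180], list(range(30, 600, 30))))  # [60, 360, 540]
```

Ratio schedules depend on how many responses the animal makes, while interval schedules depend on when it responds, which is why extra presses during an interval earn nothing.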
Educational Implications
Identification of the root cause of the behavior.
Eliminates Negative Behavior: Operant conditioning
theory includes negative reinforcement, which
strengthens a desired behavior by removing an unpleasant
stimulus when that behavior occurs.
By building operant conditioning techniques into lesson
plans, it is possible to teach children useful skills as
well as good behaviors.
The use of reinforcement in the form of rewards motivates
children to keep learning and perform better.