Pervasive Negative Effects of Rewards on Intrinsic Motivation: The Myth Continues
1 (Spring)
This work was supported by a research grant from the Social Sciences and Humanities Research Council of Canada.
Correspondence concerning this article should be addressed to Judy Cameron, Department of Educational Psychology, 6-102 Education North, University of Alberta, Edmonton, Alberta, T6G 2G5 Canada.

Most parents, educators, and behavior analysts would agree that the ideal student is one who performs academic tasks at a high level, shows high interest and involvement in school activities, is willing to take on challenging assignments, and is a self-motivated learner. To instill interest and to heighten student performance, many practitioners implement reward and incentive systems in educational settings. In recent years, the wisdom of this practice has been debated in literature reviews, textbooks, and the popular media. Many writers and researchers claim that giving students high grades, prizes, money, and even praise for engaging in an activity may be effective in getting students to perform a task, but performance and interest are maintained only as long as the rewards keep coming. In other words, rewards are said to undermine intrinsic motivation. This premise is based on the view that when individuals like what they are doing, they experience feelings of competence and self-determination. When students are given a reward for performance, the claim is that they begin to do the activity for the external reward rather than for intrinsic reasons. As a result, perceptions of competence and self-determination are said to decrease and motivation to perform the activity declines.

Those who decry the use of rewards support their position by citing experimental studies on rewards and intrinsic motivation conducted in social psychology and education. Since the 1970s, dozens of experiments have been designed to assess the impact of rewards on intrinsic motivation. A cursory examination of the studies, however, reveals a mixed set of findings. That is, in some studies, extrinsic re-
2 JUDY CAMERON et al.
eron and Pierce and Eisenberger and Cameron reported minimal negative effects of tangible reward, whereas Deci et al. found tangible rewards to be detrimental under a wide range of conditions.

Although the usefulness of meta-analysis and statistical testing in general has been questioned by behavioral researchers (e.g., see Baron & Derenne, 2000; Derenne & Baron, 1999), research summaries based on meta-analyses have become valued sources of information for both policy makers and researchers. Deci et al.'s (1999) meta-analytic finding of general negative effects of reward has important implications. Thus, to understand why the meta-analyses by Cameron and Pierce (1994) and Deci et al. resulted in different findings, it is important to be familiar with the technique and logic of meta-analysis. The meta-analytic procedures described below are based on Hedges and Olkin (1985); these were the basic procedures used by Cameron and Pierce and by Deci et al.

THE TECHNIQUE AND LOGIC OF META-ANALYSIS

Meta-analysis is a technique for combining the results of a large number of studies on the same topic. It involves combining data from conceptually related studies to reach generalizations based on statistical criteria. Quantitative analyses, similar to meta-analysis, have been conducted on single-subject designs (e.g., see Kollins, Newland, & Critchfield, 1997); however, meta-analysis is typically used with between-groups designs in which a treatment group (e.g., a rewarded group) is compared to a control group (nonrewarded group) on a common dependent measure (intrinsic motivation). The goals of a meta-analysis are to establish the relation between independent and dependent variables (in this case, the relation between rewards and intrinsic motivation) and to determine what factors moderate or alter the magnitude of the relation (e.g., type of reward, reward contingency). Conducting a meta-analysis entails specifying the criteria for including and excluding studies, collecting all experiments that meet the criteria, and coding the studies.

Once all relevant studies are identified, the statistical results of each study are transformed into a measure called an effect size. An effect size is found by converting the findings from each study into a standard deviation unit. In the rewards and intrinsic motivation literature, an effect size indicates the extent to which the experimental group (rewarded group) and the control group (nonrewarded group) differ in the means on measures of intrinsic motivation (e.g., free time on task after rewards are removed, task interest). In its simplest form, the effect size (g) is the difference between the means of the rewarded group and the nonrewarded control group divided by the pooled standard deviation of this difference. In a meta-analysis, the effect size from each study, rather than the individual participants within a study, becomes the unit of analysis. If the effect sizes from all the studies present a random pattern, they will hover around zero, indicating no evidence for an effect. On the other hand, the effect sizes may cluster in a positive or negative direction, indicating that something is going on.

One problem in meta-analysis arises when studies do not provide enough information to calculate effect sizes. When means and standard deviations are not available, effect sizes can be calculated from t tests, F statistics, and p values (see Hedges & Becker, 1986). However, in some cases, there may still be insufficient information to obtain an effect size. The meta-analyst can contact the researchers and try to obtain the missing data. When the data cannot be procured, the study can be excluded from the analyses or assigned an effect size of 0.00 (indicating no difference between experimental and control groups). It has been argued that includ-
ing zero effect sizes is a conservative strategy; if a significant effect is detected in spite of the inclusion of zeros, the contention is that the results would not be altered if missing data were available (for a discussion of this issue, see Light & Pillemer, 1984). On the other hand, if one's bias is toward no effect (i.e., we are satisfied if the treatment is not harmful), including zeros favors this conclusion. One strategy for dealing with this issue is to conduct the analyses with zeros included and excluded.

After effect sizes (g) are calculated for each relevant study, an overall mean effect size (d+) is obtained. First, g is converted to d by correcting for bias (g is an overestimation of the population effect size, particularly for small samples; see Hedges, 1981). The overall mean effect size is obtained by weighting each effect size by the reciprocal of its variance and averaging the weighted d. This procedure gives more weight to effect sizes that are more reliably estimated. The calculation of mean effect sizes provides a significance test (whether the value differs significantly from 0.00) and a 95% confidence interval (CI) (when the CI contains 0.00, the results suggest that there is no evidence of a statistically significant effect).

In a hierarchical meta-analysis, all studies are included in an overall analysis. The researcher then searches for moderator variables. The studies are broken out by one key moderator, then another, and so on. The moderators that the researcher chooses to examine may be based on theoretical considerations or on differences between the studies (e.g., different procedures used in the studies, different characteristics of the samples used, year of publication, etc.).

Hedges and Olkin (1985) recommend using homogeneity tests to ascertain whether a moderator analysis is necessary. Essentially, the procedure is to use a chi-square statistic, Q, with K - 1 degrees of freedom, where K is the number of effect sizes. The null hypothesis is that the effect sizes are homogeneous (i.e., effect sizes in a given analysis are viewed as values sampled from a single population; variation in effect sizes among studies is merely due to sampling variation). When Q is statistically significant, the implication is that moderator analyses should be conducted. The original set of studies is then broken into subsets until the chi-square statistics within the subgroups are nonsignificant. When the researcher has exhausted potential moderators and homogeneity is still not obtained, outliers (studies with extreme effect-size values) are examined independently and the analysis is conducted with outliers removed.

DIFFERENCES BETWEEN CAMERON AND PIERCE'S (1994) AND DECI ET AL.'S (1999) META-ANALYSES

Although Deci et al. (1999) and Cameron and Pierce (1994) used the same meta-analytic procedures to evaluate the research on rewards and intrinsic motivation, their results differed. Cameron and Pierce conducted a hierarchical meta-analysis of the rewards and intrinsic motivation literature. Studies were included if they had a rewarded group and a nonrewarded control group and if they used one of the two main measures of intrinsic motivation (free time on the task after the reward was removed or self-reported task interest). The effects of reward on the two dependent measures (free time and task interest) were assessed separately. When a study did not provide enough information to calculate an effect size, it was not included in the analyses.

Cameron and Pierce (1994) were first interested in whether rewards, overall, produced negative effects on measures of intrinsic motivation. Their findings indicated no overall negative effects on either measure of intrinsic motivation. However, the set of effect sizes was significantly heterogeneous; thus, the researchers conducted a num-
ber of moderator analyses to determine when and under what conditions rewards produced negative effects. Rewards were broken down by reward type (tangible and verbal). Tangible rewards were subdivided into expected and unexpected, and expected tangible rewards were further separated by the reward contingency. Cameron and Pierce used a behavioral framework to categorize rewards by reward contingency; in addition, they used the categories suggested by Deci and Ryan's (1985) cognitive evaluation theory framework. Their results indicated negative effects on the free-time measure only when the rewards were tangible, expected, and not contingent on meeting a performance standard. The same findings were reported by Eisenberger and Cameron (1996), who carried out some additional analyses of reward contingencies.

Deci et al. (1999) suggested that Cameron and Pierce's (1994) and Eisenberger and Cameron's (1996) failure to detect more pervasive negative effects was due to methodological inadequacies. Specifically, they criticized Cameron and Pierce and Eisenberger and Cameron for the following: (a) collapsing across tasks with high and low initial interest and omitting a moderator analysis of initial task interest, (b) including a study that used an inappropriate control group (Boal & Cummings, 1981), (c) omitting studies or data as outliers rather than attempting to isolate moderators, (d) omitting studies that were published during the period covered by their meta-analysis, (e) omitting unpublished doctoral dissertations, and (f) misclassifying studies into reward contingencies as defined by cognitive evaluation theory.

To rectify these issues in their recent meta-analysis, Deci et al. (1999) excluded the study by Boal and Cummings (1981), included studies that were missed in the previous meta-analyses, and included unpublished doctoral dissertations. In addition, in contrast to Cameron and Pierce (1994), for studies with insufficient information to calculate effect sizes, Deci et al. imputed effect sizes of 0.00 and included these in each of their analyses.

In terms of initial task interest, Deci et al. (1999) noted that "the field of inquiry has always been defined in terms of intrinsic motivation for interesting tasks and the undermining phenomenon has always been specified as applying only to interesting tasks insofar as with boring tasks there is little or no intrinsic motivation to undermine" (p. 633). Given that cognitive evaluation theory has little to say about the effects of rewards on low-interest tasks, Deci et al.'s meta-analysis focused on reward effects on high-interest tasks. Studies or conditions within studies were included only if the tasks used were measured or defined to be initially interesting; studies or conditions within studies were excluded if the tasks used were measured or defined as initially uninteresting.

Thus, Deci et al.'s (1999) meta-analysis began with the overall effects of rewards on intrinsic motivation for tasks of initial high interest only. Deci et al. analyzed the effects of reward on measures of self-reported task interest and free-choice intrinsic motivation. Their free-choice measure included time spent on a task after rewards were removed. When a time measure was not reported in a study, Deci et al. used measures of task persistence during the free-choice period (e.g., number of trials initiated in a labyrinth game, number of balls played in a pinball game, number of successes on a task). Hence, Deci et al.'s analysis of the free-choice measure was broader than the analysis by Cameron and Pierce (1994), who used only studies that assessed time measures.

On tasks of high initial interest, Deci et al. (1999) found a significant negative effect of rewards on the free-choice measure and a non-significant effect on the self-report measure. Both mean effect sizes were heterogeneous. To obtain homogeneity at each level of analysis, Deci et al. tested a number of moderator variables. When homoge-
neity could not be obtained, Deci et al. followed the procedure used by Cameron and Pierce (1994) and identified and removed outliers. First, Deci et al. tested whether verbal versus tangible rewards were a moderator. Verbal rewards were found to increase free-choice intrinsic motivation for college students (a nonsignificant effect was found for children) and to enhance task interest for both children and college students. Tangible rewards produced negative effects on both the free-choice and self-report measures. In accord with Cameron and Pierce, tangible rewards were broken down into expected and unexpected rewards. Unexpected rewards had no significant effects; expected tangible rewards were found to significantly undermine both self-reported task interest and free-choice intrinsic motivation.

Using cognitive evaluation theory as their framework, Deci et al. (1999) further subdivided expected tangible rewards into task-noncontingent, engagement-contingent, completion-contingent, and performance-contingent rewards. Task-noncontingent rewards were "those given without specifically requiring the person to engage in the activity" (p. 636); engagement-contingent rewards were those offered to participants for engaging in a task without a requirement to complete the task, do it well, or reach some standard. Completion-contingent rewards were those offered and given for completing a task, and performance-contingent rewards were defined as those "offered dependent upon the participants' level of performance" (p. 636). Deci et al. found no significant negative effects for task-noncontingent rewards; engagement-contingent rewards produced significant negative effects on both free-choice intrinsic motivation and self-reported task interest. Completion-contingent and performance-contingent rewards also resulted in significant negative effects on the free-choice intrinsic motivation measure.

In addition, Deci et al. (1999) provided a breakdown of performance-contingent rewards into studies of "maximum" and "not-maximum" reward. In studies of maximum reward, participants were offered rewards graded in terms of meeting a criterion or performance standard; all met the criterion and received the full amount of reward. Six studies were identified by Deci et al. as involving not-maximum reward. In these studies, some participants failed to attain the criterion and were given less than the maximum reward. Deci et al. reported that relative to a nonrewarded control condition, participants receiving less than the maximum reward showed a large decline in free-choice intrinsic motivation. In fact, the value (d = -0.88) was the largest mean effect size in their entire analysis.

As a supplemental analysis, Deci et al. (1999) analyzed studies with children in which the free-choice assessment of high-interest activities was conducted immediately following the removal of reward, within a week, and after a week. Deci et al. found negative effects at each time of assessment and suggested that the undermining effect is not a transitory phenomenon. An additional analysis of the effects of rewards on low-interest tasks was conducted by Deci et al.; no statistically significant effects were detected.

All in all, Deci et al.'s (1999) meta-analysis produced numerous negative effects of the various reward contingencies. Given the discrepancies between Deci et al.'s and Cameron and Pierce's (1994) findings, it is important to examine carefully the procedures used by Deci et al. The first noteworthy difference between the two meta-analyses occurs at the level of all rewards. Cameron and Pierce were interested in assessing the overall effects of rewards across all types of tasks. Deci et al. did not conduct this analysis; instead, they argued that the more theoretically relevant question concerned the effects of rewards on tasks of high initial interest. We contend that an analysis of the overall effect of reward is central to an understanding of this complex area of
research. On a practical level, many educators, parents, and administrators have taken the position that rewards and incentive systems are harmful. The view is that rewards negatively affect students' intrinsic interest across all types of activities (e.g., reading, math, science, computer games, etc.); no distinction is made between low and high initial levels of task interest. Writers who caution against the use of rewards and reinforcement frequently use examples to illustrate their point. More often than not, activities such as reading, lawn mowing, and mathematics are cited as activities that people will lose interest in if they are given rewards for performing the activity. Most of these activities are not ones that individuals begin doing with high levels of initial interest. Importantly, policy makers who adopt the view that rewards are harmful rarely distinguish between high- and low-interest tasks. Because of this, an analysis of the overall effects of reward is warranted. It is our contention that a more complete hierarchical breakdown of the effects of rewards on intrinsic motivation should begin at the level of all rewards over all types of tasks. Following this, a breakdown of reward effects on high- and low-interest tasks would be appropriate.

A further difficulty with Deci et al.'s (1999) meta-analysis concerns their supplemental analysis of reward effects on low-interest tasks. Several studies that used low-interest tasks were excluded from their primary meta-analysis of high-interest tasks (e.g., Freedman & Phillips, 1985; Overskeid & Svartdal, 1996). The problem is that these studies were not brought back into their supplementary analysis of low-interest tasks.

Another concern is that for some studies in their analysis of high-interest tasks, Deci et al. (1999) omitted conditions that were relevant to their analyses. For example, in an experiment by Wilson (1978), one group was offered $0.50 to engage in the target activity, a second group was offered $2.50 and a control group performed the task without the offer of reward. In Deci et al.'s analyses, only one of the rewarded groups was included. For other studies that used more than one level of reward magnitude (e.g., Earn, 1982; McLoyd, 1979; Newman & Layton, 1984), Deci et al. included all reward conditions. Their omission of certain conditions within studies does not appear to be systematic (e.g., reward magnitude was not examined by Deci et al. as a moderator), yet there are a number of different types of cases in which this occurs. In addition, as did Cameron and Pierce (1994), Deci et al. also missed a few experiments that met their inclusion criteria and that were published during the period covered by their meta-analysis. Also, several studies using high-interest tasks that measured self-reported task interest were either excluded or inadvertently omitted from Deci et al.'s analyses. Many of these studies found positive effects on the self-report measure of task interest; Deci et al.'s omission of these effects helps to explain why they found either negative effects or no effects on the task-interest measure. A list of studies not included in Deci et al.'s analyses, dependent measures that were precluded, and a description of conditions omitted by Deci et al. are presented in Appendix A. Any computational differences in sample sizes and effect sizes are also outlined in Appendix A.

A final issue concerns the classification of studies into various reward contingencies. Deci et al. (1999) suggested that Cameron and Pierce (1994) miscategorized many experiments. Using cognitive evaluation theory to guide their classification of studies, Deci et al. established the categories of task-noncontingent, engagement-contingent, completion-contingent, and performance-contingent rewards. Although this categorization system may be informative for cognitive evaluation theory, the problem is that the categories are too broad. Studies that used very different procedures were pooled
TABLE 1
Description of expected tangible reward contingencies

Reward contingency | Description
Task noncontingent | Reward is offered for agreeing to participate, for coming to the study, or for waiting for the experimenter. Offer of reward is unrelated to engaging in the task.
Rewards offered for doing well | Reward is offered for doing well on the task or for doing a good job. No specification is given as to what it means to do a good job or to do well.
Rewards offered for doing a task | Reward is offered to engage in the experimental activity. No instructions are given about how well participants must perform or whether they must complete the task.
Rewards offered for finishing or completing a task | Reward is offered to finish an activity, to complete a task, or to get to a certain point on the task. The reward is not related to quality of performance.
Rewards offered for each unit solved | Reward is offered for each unit, puzzle, problem, etc., that is solved.
Rewards offered for surpassing a score | Reward is offered for surpassing a particular specified score (absolute standard). In some cases, the better the score, the higher the reward.
Rewards offered for exceeding a norm | Reward is offered to meet or exceed the performance of others on the task (relative standard).
ward delivery, and wrote down what was said to participants and how the reward was delivered. We then organized the studies into seven main categories of reward contingency: rewards delivered regardless of task involvement (task noncontingent); rewards given for doing a task; rewards for doing well; rewards for finishing or completing a task; rewards given for each problem, puzzle, or unit solved; rewards for achieving or surpassing a specific score; and rewards for meeting or exceeding others. Although all studies were coded for reward contingency, it was at the level of expected (offered) tangible reward that it became necessary to analyze studies in the various reward contingencies. Other analyses resulted in homogeneity, and further breakdowns were not required. In Table 1, we provide definitions and descriptions for each of these contingencies at the level of expected tangible reward. A comparison of our reward contingencies and those of Deci et al. (1999) is presented in Appendix B. We return to a discussion of these comparisons in our results section.

In some studies, there was not enough information to code the contingency (e.g., Chung, 1995; Hom, 1987). In addition, a few studies used a contingency that did not fit into any of the seven categories; for example, W. E. Smith (1975) offered rewards to participants for showing signs of learning. These studies were included in overall analyses, but were omitted from the analysis of reward contingencies. A list of the studies used in each analysis, a description of reward type, reward expectancy, and reward contingency, together with effect sizes are presented in Appendixes C through G.

To ensure reliability of coding, the second author was given the definitions for each contingency (Table 1) and a sample of 32 studies to code (each of the studies involved expected tangible rewards). Reliability calculated as percentage agreement was 97% (31 of 32 studies). One study (L. W. Goldstein, 1977) included a condition in which participants were offered a reward to take pictures. The issue was whether this contingency involved reward simply for doing the task or for finishing the task. The third author was brought in to code the study; he pointed out
that participants in the reward condition were not required to complete or finish the task to obtain the reward and that Goldstein stated that "the reward did not imply that the subject had done well on the task, only that s/he had engaged in it" (p. 30). Hence, the reward contingency was classified as a reward offered for doing the task.

Finally, we identified studies that involved maximum or less than maximum reward. Such studies involved offering participants a reward for doing well, for finishing a task, for each problem or unit solved, for surpassing a score, or for exceeding a norm. Studies were considered to produce the maximum reward if participants in the reward condition met the performance requirements and received the full reward. Less than maximum reward occurred when there was a time limit such that some participants were unable to meet all the requirements in the time allotted and were given less than the full reward. For example, Deci's (1971) experiment involved less than maximum reward. Participants were offered $1.00 for each of four puzzles solved within a 13-min time limit. Not all participants were able to solve the puzzles within the time limit and did not receive the full reward.

Calculation and Analysis of Effect Sizes

After all studies were coded, we calculated effect sizes (g) for each comparison of a rewarded group to a nonrewarded group on the free-choice and self-report measures of intrinsic motivation. Positive effect sizes indicate that rewards produced an increase in measures of intrinsic motivation relative to a control group, negative effect sizes denote a decrease, and an effect size of 0.00 indicates no difference. When there was not enough information to calculate an effect size, we attempted to contact the researchers. From a list of 22 researchers, we were able to locate E-mail addresses for nine. E-mail messages were sent requesting the missing data. Eight people replied; six could not locate the data, and two provided us with data for studies by Wicker, Brown, Wiehe, and Shim (1990) and by Dollinger and Thelen (1978). When we could not obtain missing data, we imputed an effect size of 0.00. Each analysis was conducted with zeros included and excluded. In accord with Deci et al. (1999), we report the analyses with the zeros included; however, when mean effect sizes were altered to any extent by the inclusion of zeros, we report the analysis with and without zeros.

Eisenberger, Pierce, and Cameron (1999) pointed out that there were two possible types of control comparisons for some of the studies labeled performance contingent by Deci et al. (1999). In some studies, the control group was told the performance objectives and was given performance feedback (complete control); in others, the control group was not told a performance objective and no feedback was given (partial control). Eisenberger, Pierce, and Cameron examined differences between these two types of comparisons (reward vs. partial control, reward vs. complete control). One small difference was detected on the free-choice measure. When rewards were offered to exceed others, reward versus a partial-control condition resulted in a nonsignificant positive effect; the mean effect for reward versus a complete control was significantly positive (no other comparisons resulted in differences). Because this difference was small and both mean effects were in the same direction, we included studies with either type of control condition in the present analyses. If a study contained both types of controls (e.g., Harackiewicz, Manderlink, & Sansone, 1984), one effect size was calculated comparing the reward condition to both controls.

In accord with Deci et al. (1999) and with our previous procedures, more than one effect size was calculated for several studies in our analyses. For example, if a single study assessed free choice and used two types of expected tangible rewards (e.g., rewards offered
TABLE 2
Hierarchical analysis of the effects of rewards on measures of intrinsic motivation

Analysis of the effects of reward | K | N | d+ | 95% CI
Free-choice intrinsic motivation
All reward^a | 115 | 8,176 | -0.08 | -0.12, 0.02
Low initial task interest | 12 | 429 | 0.28* | 0.07, 0.47
High initial task interest^a | 114 | 7,888 | -0.09* | -0.14, -0.04
Verbal reward | 25 | 1,374 | 0.31* | 0.20, 0.41
Tangible reward^a | 102 | 6,942 | -0.17* | -0.22, -0.12
Unexpected reward | 9 | 375 | 0.02 | -0.18, 0.22
Expected reward (offered)^a | 101 | 6,703 | -0.18* | -0.23, -0.13
Self-reported task interest
All reward^a | 100 | 8,028 | 0.12* | 0.07, 0.16
Low initial task interest | 11 | 503 | 0.12 | -0.06, 0.30
High initial task interest^a | 98 | 7,547 | 0.12* | 0.07, 0.17
Verbal reward^a | 24 | 1,584 | 0.32* | 0.22, 0.43
Tangible reward^a | 83 | 6,354 | 0.08* | 0.03, 0.13
Unexpected reward | 5 | 299 | 0.03 | -0.20, 0.26
Expected reward (offered)^a | 81 | 6,138 | 0.08* | 0.03, 0.13

Note. K = number of studies; N = total sample size; d+ = mean weighted effect size; 95% CI = 95% confidence interval.
^a The sample of effect sizes was significantly heterogeneous.
*p < .05.
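
The quantities reported in Table 2 (d+, the 95% CI, and the homogeneity statistic Q) can be illustrated with a short Python sketch. It assumes the standard fixed-effects formulas usually attributed to Hedges and Olkin (1985): the small-sample correction 1 - 3/(4N - 9), the large-sample variance of d, and inverse-variance weighting. The text does not spell out these formulas or the internals of the Meta program, so the function names and the three example studies below are hypothetical and serve only as an illustration.

```python
import math

def hedges_g(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Difference between group means divided by the pooled standard deviation."""
    pooled_var = ((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / (n_e + n_c - 2)
    return (mean_e - mean_c) / math.sqrt(pooled_var)

def correct_bias(g, n_e, n_c):
    """Convert g to d with the usual small-sample correction (assumed form)."""
    return g * (1 - 3 / (4 * (n_e + n_c) - 9))

def variance_d(d, n_e, n_c):
    """Large-sample variance of d (assumed form)."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))

def combine(effects):
    """Combine (d, variance) pairs into d+, a 95% CI, Q, and its degrees of freedom."""
    weights = [1 / var for _, var in effects]
    total_weight = sum(weights)
    d_plus = sum(w * d for w, (d, _) in zip(weights, effects)) / total_weight
    se = math.sqrt(1 / total_weight)
    ci = (d_plus - 1.96 * se, d_plus + 1.96 * se)
    q = sum(w * (d - d_plus) ** 2 for w, (d, _) in zip(weights, effects))
    return d_plus, ci, q, len(effects) - 1

def is_significant(ci):
    """A mean effect is treated as significant when the CI excludes zero."""
    return not (ci[0] <= 0.0 <= ci[1])

# Hypothetical studies (not taken from the article):
# (rewarded mean, control mean, rewarded SD, control SD, n rewarded, n control)
studies = [
    (10.0, 8.0, 4.0, 4.0, 20, 20),
    (5.5, 6.0, 2.0, 2.5, 15, 15),
    (12.0, 11.0, 5.0, 5.0, 30, 30),
]

effects = []
for mean_e, mean_c, sd_e, sd_c, n_e, n_c in studies:
    d = correct_bias(hedges_g(mean_e, mean_c, sd_e, sd_c, n_e, n_c), n_e, n_c)
    effects.append((d, variance_d(d, n_e, n_c)))

d_plus, ci, q, df = combine(effects)
print(f"d+ = {d_plus:.2f}, 95% CI = {ci[0]:.2f}, {ci[1]:.2f}, Q({df}) = {q:.2f}")
print("significant" if is_significant(ci) else "not significant")
```

To judge homogeneity, Q would be compared against a chi-square distribution with K - 1 degrees of freedom; that lookup is omitted here to keep the sketch free of external libraries.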
for doing the task and rewards offered for surpassing a certain score) plus a control group, two effect sizes were calculated. Each individual effect size was entered into the relevant analysis (expected tangible rewards for doing a task, expected tangible rewards for surpassing a score). For the analyses of expected tangible reward, tangible reward, and all reward, one effect size was calculated (the two groups were compared to the control group) and entered into the overall analyses. This strategy satisfies the independence assumption of meta-analytic statistics (Hedges & Olkin, 1985) and gives equal weight to each study analyzed. Thus, subcategories (e.g., rewards offered for doing the task, for doing well, etc.) may contain more effect sizes than the superordinate category (expected tangible reward). For example, for all reward on the free-choice measure (over both high- and low-interest tasks), there were 126 effect sizes, but only 115 of these are independent (several are within the same study).

After all effect sizes were calculated, the present analyses were run on the computer program Meta (Schwarzer, 1991) using the weighted integration method described in our section on meta-analytic procedures. The program converts effect size, g, to d; mean weighted effect size (d+) is obtained; 95% CI is constructed around the means, and a homogeneity statistic, Q, is computed.

RESULTS OF OUR META-ANALYSIS

In Table 2, we present the results for our meta-analysis up to the level of reward contingency. Table 2 presents mean weighted effect size (d+) and 95% CI for each analysis. Mean effects are considered statistically significant when the CI does not include zero. In the present meta-analysis, positive effect sizes indicate that reward produces increases in intrinsic motivation, negative effect sizes support the claim that rewards undermine intrinsic motivation, and zero effects indicate no evidence for an effect of reward. According to J. Cohen
(1988), an effect size of ±0.20 is considered small, ±0.50 is moderate, and greater than ±0.80 is large.

All Rewards

First, the overall effects of reward were analyzed across all conditions and across high- and low-interest tasks. On the free-choice measure, Table 2 indicates that there was no significant effect (d+ = -0.08, CI = -0.12, 0.02). On the measure of self-reported task interest, a small significant positive effect was detected (d+ = 0.12, CI = 0.07, 0.16). This analysis was not conducted by Deci et al. (1999); therefore, the findings cannot be compared. The results are, however, in accord with those of Cameron and Pierce (1994). On both the free-choice and self-report measures, however, the sets of studies were significantly heterogeneous, suggesting the necessity of a moderator analysis. Thus, at the next level of analysis, we divided studies into those with low- and high-interest tasks.

The Effects of Rewards on Low-Interest Tasks

When reward effects were analyzed for tasks with low initial interest, Table 2 shows a statistically significant positive effect on the free-choice measure (d+ = 0.28, CI = 0.07, 0.47); there was no significant effect on self-reported task interest (d+ = 0.12, CI = -0.06, 0.30). These findings indicate that when a task is not initially interesting, rewards enhance free-choice intrinsic motivation but not verbal expressions of task interest.

Although the studies in this analysis were considered homogeneous (i.e., Q was not significant), we examined whether there were any differences among different types of rewards, expectancies, and contingencies. On the free-choice measure, only one study included a condition that used a verbal reward (the effect was positive). For tangible reward, one study included an unexpected reward condition (the effect was positive). All of the 12 studies with low-interest tasks included an expected tangible reward condition; compared with a nonreward control, the mean effect was significantly positive (d+ = 0.26, CI = 0.06, 0.45). Nine studies involved offering the reward for doing the task; the effect on the free-choice measure remained significant (d+ = 0.26, CI = 0.03, 0.48). For self-reported task interest, no significant effects were found under any of the conditions.

In Deci et al.'s (1999) supplemental analysis of low-interest tasks (p. 651), fewer studies were included and no significant effects were found on either the free-choice or the self-report measures of intrinsic motivation.

The Effects of Rewards on High-Interest Tasks

For high-interest tasks, the mean effect size on free choice (Table 2) showed a small but significant negative effect (d+ = -0.09, CI = -0.14, -0.04); the set of effect sizes, however, was heterogeneous. The mean effect size for self-reported task interest was significant, small, but in a positive direction (d+ = 0.12, CI = 0.07, 0.17); the sample of effect sizes was also heterogeneous. Deci et al. (1999) also reported a significant negative effect on the free-choice measure but a nonsignificant effect on the task-interest measure. As noted, Deci et al. omitted or missed several self-report effect sizes.

Verbal Rewards

Verbal rewards were found to significantly enhance both free-choice intrinsic motivation (d+ = 0.31, CI = 0.20, 0.41) and self-reported task interest (d+ = 0.32, CI = 0.22, 0.43). These results were also obtained by Deci et al. (1999), who reported similar small to moderate positive effects of verbal rewards.

On the free-choice measure, the set of effect sizes was homogeneous, suggesting that no further breakdowns were necessary. In most studies of verbal reward, the rewards were unex-
pected and the mean effect was positive; a positive effect was also found with the five studies that used expected rewards. In addition, verbal rewards were generally delivered simply for doing a task and were not contingent on any specific level of performance (again, the effects were positive). When the effects of verbal reward on free choice were examined with children versus adults (mainly college students), children showed a smaller positive effect (K = 10, N = 320, d+ = 0.22, CI = 0.04, 0.39) than adults (K = 15, N = 844, d+ = 0.36, CI = 0.22, 0.49). Deci et al. (1999) also reported a larger effect for adults but a nonsignificant effect for children (our effect size for children was statistically significant because we included more studies than Deci et al.).

On the task-interest measure, the set of effect sizes for verbal reward was significantly heterogeneous. We conducted moderator analyses of children versus adults and expected versus unexpected reward. Mean effect sizes for each of these analyses remained significantly positive, but homogeneity was still not obtained. In almost all studies, the rewards were given for doing the task; hence, this reward contingency could not be a moderator.

To obtain homogeneity, three studies were removed from the analysis (the same outliers were removed by Deci et al., 1999). Inspection of the outliers indicated that two of the studies (Butler, 1987; Vallerand, 1983) produced large positive effects; these studies did not differ in obvious ways from other studies in the sample except for their tendency to generate extreme values of effect size. The third outlier (Kast & Connor, 1988) produced a negative effect (-0.46). Kast and Connor compared control participants to participants who were praised for their performance on the task as well as to another group who were also praised but who were told that they should be doing well. The second verbal reward condition produced a negative effect and was different from verbal reward used in other studies; Deci et al. termed this "controlling" reward. When the outliers were removed from the analysis of verbal rewards on the task-interest measure, the set of studies was homogeneous and the mean effect remained significantly positive (K = 21, N = 1,194, d+ = 0.32, CI = 0.21, 0.44). In this data set, there were six studies that did not provide enough information to obtain an estimate of effect size (these studies were given an effect size of 0.00). When these studies were removed, the mean effect size for task interest showed a slight increase (K = 15, N = 981, d+ = 0.40, CI = 0.27, 0.53).

Tangible Rewards

When the effects of tangible rewards on high-interest tasks were analyzed, Table 2 shows a small significant negative effect on the free-choice measure (d+ = -0.17, CI = -0.22, -0.12) and a small significant positive effect on self-reported task interest (d+ = 0.08, CI = 0.03, 0.13). Both of these samples of effect sizes were significantly heterogeneous and required a further moderator analysis.

Reward expectancy. Tangible rewards were subdivided into unexpected (rewards delivered without a statement of the contingency) and expected (rewards delivered after a statement of contingency) categories. No significant effects were detected for unexpected tangible rewards (see Table 2), and the samples were homogeneous (Deci et al., 1999, reported similar findings). Expected tangible rewards produced a negative effect on the free-choice measure (d+ = -0.18, CI = -0.23, -0.13) and a positive effect on the self-report measure (d+ = 0.08, CI = 0.03, 0.13), but both of these samples were significantly heterogeneous.

Reward contingency. For the next level of analysis, expected tangible rewards were subdivided into various reward contingencies. Results of our analysis on the free-choice measure are presented in Figure 1. No significant
[Figure 1 panel title: FREE CHOICE. High-interest tasks; expected (offered) tangible rewards.]
Figure 1. The effects of expected tangible reward contingencies on free-choice intrinsic motivation
under high levels of initial task interest. K = number of studies, N = total sample size, d+ = mean
weighted effect size; statistically reliable effect sizes are marked with an asterisk (*p < .05, **p <
.01). Positive effect sizes indicate higher intrinsic motivation for rewarded versus control groups;
negative effect sizes indicate lower intrinsic motivation for rewarded groups. Numbers in parenthe-
ses represent 95% confidence intervals. All effect sizes are based on homogeneous samples.
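The quantities named in the caption (K, d+, the 95% CI, and the homogeneity of a sample) can be illustrated with a short sketch. It assumes the standard fixed-effect formulas from Hedges and Olkin (1985): inverse-variance weights, a normal-theory 95% CI, and Q as the weighted sum of squared deviations from d+. The Meta program's exact computations are not reproduced here; the study values are hypothetical, and the function names are illustrative.

```python
import math


def g_to_d(g, n_reward, n_control):
    """Small-sample correction of g to d (assumed Hedges and Olkin form)."""
    return g * (1 - 3 / (4 * (n_reward + n_control) - 9))


def d_variance(d, n_reward, n_control):
    """Large-sample variance of a single effect size d."""
    total = n_reward + n_control
    return total / (n_reward * n_control) + d ** 2 / (2 * total)


def combine(studies):
    """Combine (d, n_reward, n_control) tuples into (d_plus, lower, upper, q)."""
    weights = [1 / d_variance(d, n1, n2) for d, n1, n2 in studies]
    effects = [d for d, _, _ in studies]
    weight_sum = sum(weights)
    # Mean weighted effect size: each d is weighted by its inverse variance.
    d_plus = sum(w * d for w, d in zip(weights, effects)) / weight_sum
    half_width = 1.96 * math.sqrt(1 / weight_sum)
    # Homogeneity statistic: weighted squared deviations from d_plus.
    q = sum(w * (d - d_plus) ** 2 for w, d in zip(weights, effects))
    return d_plus, d_plus - half_width, d_plus + half_width, q


# Hypothetical studies: (d, n in reward group, n in control group).
observed = [(-0.40, 20, 20), (-0.25, 15, 15), (-0.35, 30, 30)]
# Two further hypothetical studies coded 0.00 because no estimate was reported.
with_zeros = observed + [(0.0, 20, 20), (0.0, 25, 25)]

for label, studies in (("observed", observed), ("with zeros", with_zeros)):
    d_plus, lower, upper, q = combine(studies)
    print(f"{label:10s} K = {len(studies)}  d+ = {d_plus:.2f}  "
          f"CI = {lower:.2f}, {upper:.2f}  Q = {q:.2f}")
```

Q is referred to a chi-square distribution with K - 1 degrees of freedom; a significant Q marks the sample as heterogeneous. The second data set shows why coding unreported effects as 0.00 pulls the mean effect toward zero.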
effects were detected when the rewards were task noncontingent, were offered for finishing or completing a task, or were offered for attaining or surpassing a score. Figure 1 shows significant negative effects when the rewards were offered for doing a task, for doing well on a task, and for each unit solved. A significant positive effect was found when the rewards were offered for meeting or exceeding the performance level of others.

When rewards were offered for doing a task, the effect was significantly negative (K = 57, N = 2,910, d+ = -0.35, CI = -0.43, -0.27) but not homogeneous. Although we searched for moderators (salient vs. nonsalient reward, children vs. adults, and time of reward delivery), analyses of these variables did not result in homogeneous samples. As a result, outliers were identified and omitted. The mean effect with outliers removed is presented in Figure 1. Two of the outliers produced positive effects; the only differences between these two studies and the bulk of studies were that the study by Tripathi and Agarwal (1988) was conducted in India and the study by Brennan and Glover (1980) was designed to assess the effects of rewards when the rewards were shown to function as reinforcement. Other outliers (Chung, 1995; Danner & Lonkey, 1981; Fabes, Eisenberg, Fultz, & Miller, 1988; Morgan, 1983, Experiment 1; Okano, 1981, Experiment 2) had large negative effects but there was no common factor that could explain their extreme values.

Our findings for free choice indicate that when reward contingency is defined in terms of experimental procedures, negative, neutral, and positive effects are obtained. Using cognitive evaluation theory as their framework for the categorization of reward contingencies, Deci et al. (1999) found neg-
Figure 2. A comparison of our findings with Deci et al.'s (1999) effects of expected tangible
reward contingencies on free-choice intrinsic motivation for high-interest tasks. Deci et al.'s cate-
gories of completion-contingent and performance-contingent reward contained studies that involved
"reward offered for each unit solved."
ative effects for all but task-noncontingent rewards. One way to understand these differences is to compare Deci et al.'s effects and definitions of contingencies with our effects and procedural definitions. Figure 2 shows this comparison and indicates that Deci et al.'s completion-contingent and performance-contingent rewards consisted of a variety of reward procedures having different effects.

Our results for the task-interest data are presented in Figure 3. The analysis shows no significant effect for task-noncontingent rewards, a small significant negative effect for rewards offered for doing, and significant positive effects for each of the other contingen-
[Figure 3 panel title: SELF-REPORTED TASK INTEREST. High-interest tasks; expected (offered) rewards.]
Figure 3. The effects of expected tangible reward contingencies on self-reports of task interest
under high levels of initial task interest. K = number of studies, N = total sample size, d+ = mean
weighted effect size; statistically reliable effect sizes are marked with an asterisk (*p < .05, **p <
.01). Positive effect sizes indicate higher intrinsic motivation for rewarded versus control groups;
negative effect sizes indicate lower intrinsic motivation for rewarded groups. Numbers in parenthe-
ses represent 95% confidence intervals. All effect sizes are based on homogeneous samples.
cies. In the analysis of rewards offered for doing, 14 studies were given effect sizes of 0.00; when these studies were removed from the analysis, the negative effect increased in magnitude from -0.13 to -0.22 (K = 24, N = 1,201, d+ = -0.22, CI = -0.33, -0.10).

In the analysis of rewards offered for each unit completed, when all studies were included the effect was positive (K = 22, N = 1,161, d+ = 0.19, CI = 0.08, 0.31) but significantly heterogeneous. Two studies (Kruglanski et al., 1975, Experiment 1; Wimperis & Farr, 1979) had a large positive effect size; when these studies were omitted, homogeneity was attained (Figure 3 presents the data for homogeneous samples).

In Figure 4, we compare Deci et al.'s (1999) findings and reward contingencies with ours. For completion-contingent and performance-contingent rewards, Deci et al. found no significant effects, whereas our findings show a number of positive effects for studies that would be included in these categories. As discussed previously, many studies with self-report measures were not included in Deci et al.'s analyses (see details in Appendix A).

Maximum versus less than maximum reward. On the free-choice measure of intrinsic motivation, there was only one reward contingency (rewards offered per unit solved) that allowed a comparison between maximum and less than maximum reward. For other reward contingencies, most studies involved maximum reward; a comparison with less than maximum reward would be unreliable. When rewards were offered for each unit solved, the findings showed nonsignificant effects for studies of maximum rewards (K = 6, N = 345, d+ = -0.03, CI = -0.25, 0.18) and a significant negative effect for studies of less than maximum reward (K = 14, N = 749, d+ = -0.22,
Figure 4. A comparison of our findings with Deci et al.'s (1999) effects of expected tangible
reward contingencies on self-reports of task interest for high-interest tasks. Deci et al.'s categories
of completion-contingent and performance-contingent reward contained studies that involved "re-
ward offered for each unit solved."
CI = -0.37, -0.07). These two sets of effect sizes were homogeneous. These results suggest that the negative effect of pay per unit is associated with participants receiving less than maximum rewards.

No analyses were conducted on differences between maximum and less than maximum rewards on the self-report measure. Most of the contingencies had too few studies that used less than maximum reward. For studies involving the offer of reward for each problem solved, there were too few experiments of maximum reward (see Appendix F).

DISCUSSION

A major issue in psychology and education is that rewards and reinforce-
ment have a detrimental effect on intrinsic motivation. The concern is that if people receive reinforcement or rewards for activities they already enjoy, they will be less motivated to engage in those activities than they were prior to the introduction of reward once the rewards are no longer forthcoming. In other words, rewards and reinforcement are said to decrease intrinsic motivation. Since the 1970s, over 100 studies have been conducted to assess the effects of reward on intrinsic motivation. The vast majority of studies on the topic have employed between-groups statistical designs. Rewarded participants are compared to nonrewarded controls. Intrinsic motivation is measured by the difference between groups on task interest and free choice (time on task and performance on task once the rewards are removed). A meta-analysis of this experimental literature by Cameron and Pierce (1994) and Eisenberger and Cameron (1996) found limited negative effects of rewards, whereas a more recent analysis by Deci et al. (1999) showed pervasive negative effects. The meta-analysis presented in this article was designed to correct flaws in the previous reviews and to resolve differences.

A Summary of Our Findings

In terms of the overall effects of reward, our meta-analysis indicates no evidence for detrimental effects of reward on measures of intrinsic motivation. This finding is important because many researchers and writers espouse the view that rewards, in general, reduce motivation and performance. In addition, many students of psychology and education are taught that rewards are harmful and that reward procedures should be avoided in applied settings. Our finding of no overall effect of reward, however, must be treated with caution. In our meta-analysis, the overall reward category lacked homogeneity, indicating the appropriateness of a moderator analysis. In other words, the overall reward category is too inclusive; rewards have different effects under different moderating conditions.

Figure 5 shows the effects of different moderating conditions. The effects of rewards on free-choice intrinsic motivation and self-reported task interest are presented only for homogeneous subsets. When a result was heterogeneous, we broke down the subset of effect sizes by different moderator variables until homogeneity was attained. A positive effect indicates that rewards enhanced the measure of intrinsic motivation relative to a control condition, a negative effect indicates a decrease for the rewarded group, and a zero effect indicates no significant effect.

The effects of all rewards are first broken into high- and low-interest tasks. When the tasks used in the studies are of low initial interest, rewards increase free-choice intrinsic motivation and leave task interest unaffected. This finding indicates that rewards can be used to enhance time and performance on tasks that initially hold little enjoyment. As Bandura (1986) recognized, "Most of the things people enjoy doing for their own sake had little or no interest for them originally. ... But with appropriate learning experiences, almost any activity ... can be imbued with consuming significance" (p. 241). Our results suggest that reward procedures are one way to cultivate interest in an activity. In education, a major goal is to instill motivation and enjoyment of academic activities. Many academic activities are not of high initial interest to students. An implication of our finding is that rewards can be used to increase performance on low-interest academic activities.

For high-interest tasks, verbal rewards are found to increase free choice and task interest. This finding replicates the results of Cameron and Pierce (1994) and Deci et al. (1999). Most social interaction in business, education, and clinical settings involves the use of verbal praise and positive feedback from managers, teachers, and therapists. When praise and other forms of
[Figure 5: two tree diagrams, FREE-CHOICE INTRINSIC MOTIVATION and SELF-REPORTED TASK INTEREST. Branches in each: task noncontingent, for doing well, for doing task, for finishing task, for each unit solved, for surpassing a score, for exceeding others. Under free choice, "for each unit solved" divides into less than maximum reward and maximum reward.]
Figure 5. A summary of the meta-analysis comparing free-choice intrinsic motivation and self-
reported task interest. 0 = no reliable effect; - = statistically significant negative effect of reward;
+ = statistically significant positive effect of reward.
positive feedback are given and later removed, our findings indicate that interest and performance increase.

The effects of tangible reward on measures of intrinsic motivation differ by reward expectancy. When rewards are delivered unexpectedly (without a description of the reward contingency), there is no evidence of a significant effect on either free choice or task interest. This suggests that it is not tangible rewards per se that undermine motivation and interest; instead it depends on instruction and the statement of contingency.

For high-interest tasks, when the rewards are tangible and expected (offered beforehand), there are different effects depending on the description of the reward contingency. When the offer of reward is unrelated to task behavior (task noncontingent), there is no evidence for an effect of reward on either free choice or task interest. On the other hand, when people are offered a tangible reward for doing a task or for doing well at a task, they often choose to do the activity less in a free-choice period. The negative effect of rewards offered for doing a task is also detected for the task-interest measure. We did not find a negative effect on task interest when the rewards are offered for doing well. One possibility is that the true effect is negative but, at this point, there are too few studies to yield a reliable estimate. In general, when the description of the reward contingency implies that rewards are loosely tied to performance, the evidence suggests that people show a small reduction in performance and interest.

Figure 5 shows that rewards offered for finishing or completing a task have a nonsignificant effect on the free-choice measure but a positive effect on task interest. Again, there were few studies in this category, and a firm conclusion about the effects is premature. Stronger conclusions can be drawn for the analysis of rewards offered for each unit solved. When participants are of-
fered a reward for each problem, puzzle, or unit solved, our findings indicate a negative effect on free choice and a positive effect on task interest. A supplementary analysis involving less than maximum reward and maximum reward shows that the negative effect on free choice occurs when participants obtain less than the full reward. In studies of less than maximum reward, participants are given a time limit to solve problems. Thus, the negative effect may be a result of time pressure rather than reward. Another way to understand this result is to consider what less than maximum reward signifies to participants. If people are told they can obtain a certain level of reward but are given less than that level, they have received feedback information that indicates failure. In other words, this type of situation may represent failure feedback, not reward. When participants are not under time pressure and are able to obtain the maximum reward, there is no significant effect on the free-choice measure.

When rewards are offered for meeting or surpassing a score, Figure 5 shows no significant effect on free choice but a significant positive effect on task interest. Rewards offered for attaining a criterion are tightly linked to level of performance. In this situation, the rewards are tied to challenge and mastery of the activity, and people express interest in the task (see Bandura, 1986). When rewards are given for exceeding the performance level of others, the results show a significant increase on free choice and task interest. One possible explanation for the positive effects of this type of reward contingency is that rewards signify competence, self-efficacy, or ability at the task, and people enjoy doing activities that reflect their competence.

Overall, our analysis shows that tangible rewards can be used to produce both negative and positive effects on measures of intrinsic motivation. Positive effects are obtained when the rewards are explicitly tied to performance standards and to success; negative effects are produced when rewards signify failure or are loosely tied to behavior.

Durability of Reward Effects

Deci et al. (1999) have claimed that negative effects of rewards are not temporary. In a supplemental analysis, Deci et al. examined studies of children in which the free-choice assessment was conducted either within a week following the removal of reward or more than a week later. Their analysis showed negative effects on free choice for each time of assessment. Deci et al. concluded that their results "indicate quite clearly that the phenomenon of extrinsic rewards undermining intrinsic motivation is not merely transitory" (p. 650). An examination of the studies included in Deci et al.'s supplementary analysis indicates that most of the effect sizes were based on rewards offered for doing the task or for doing well. When the free-choice assessment was conducted within a week following the removal of reward, 10 of 12 studies involved rewards offered for doing well or for doing the task. Of the 14 studies with assessments conducted more than a week later, 13 were concerned with rewards offered for doing well. Our interpretation of Deci et al.'s findings is that it is rewards offered for doing (or doing well) that continue to produce a negative effect on free choice, not extrinsic rewards in general. According to Bandura (1986), this kind of reward procedure imparts little indication of competence, in that the rewards are allocated without regard to quality of performance and are thus loosely tied to behavior.

An unresolved issue is whether there is a change in free-choice intrinsic motivation over time. We examined seven between-groups design studies of rewards offered for doing the task that assessed whether negative effects were maintained over time (Chung, 1995; Loveland & Olley, 1979; Morgan, 1983, Experiments 1 and 2; Ogilvie & Prior, 1982; Ross, 1975, Experiment 1;
Shiffman-Kauffman, 1990). These studies included two measures of free-choice intrinsic motivation, one after the removal of reward and a second a few weeks later. Only two of the seven studies showed a significant negative effect on the second measure (Morgan, 1983, Experiments 1 and 2). These results suggest that rewards offered for doing a task have transitory effects when multiple measures of free-choice motivation are used. This conclusion is strengthened by examining the results from studies using repeated presentations of reward followed by repeated assessments of intrinsic motivation following the removal of reward.

As previously indicated, some operant researchers tested the effects of rewards on intrinsic motivation by experimental designs in which the same individual was exposed to a baseline period, a reward intervention, and a return to baseline (Davidson & Bucher, 1978; Feingold & Mahoney, 1975; Mawhinney et al., 1989; Skaggs et al., 1992; Vasta et al., 1978). Participants were measured repeatedly during each phase of the experiment, and rewards were shown to increase measures of performance, indicating that the rewards functioned as reinforcement. The results of these experiments showed that participants spent as much (or more) time on the target activity in the postreward phase as they did in the initial baseline period. These findings indicate that negative effects of reward do not persist when task performance is rewarded on repeated occasions.

Magnitude and Impact of Reward Effects

It may be informative to consider how serious the negative effects are on high-interest tasks when the rewards are tangible, expected, and given for doing a task or for doing well on a task. In all of the studies involving these contingencies, time spent on the task during the free-choice period was the measure of free-choice intrinsic motivation. Using the free-time measure, one could ask how much less time students would spend on high-interest tasks (e.g., art, music, reading, drama) if a teacher implemented a reward system for doing the task (or doing well) and then removed it. Results from our meta-analysis indicate that the average effect size for a comparison between students who receive this reward procedure and nonrewarded individuals on time on task is about -0.30.

In the original experiments, free time on task was typically measured over an 8-min period. To convert the effect size of -0.30 to real time, one needs to know the pooled standard deviation of rewarded and nonrewarded groups. Because many researchers report only t or F statistics that cannot be converted to the overall pooled standard deviation, we are unable to provide an estimate of this parameter. Instead, we will use a well-designed study by Pretty and Seligman (1984) that provides a pooled standard deviation. Pretty and Seligman conducted two experiments with large samples and readily available statistical information. Both experiments compared a condition of tangible rewards offered for doing a high-interest task (Soma puzzles) with a nonrewarded control group on 8 min of free time. The pooled standard deviation was 2.6 min (Deci, 1971, also used Soma; the free-time measure was assessed over an 8-min period, and the pooled standard deviation was 2.4 min).

Using 2.6 min as the estimate of error, we are able to convert the negative effect size from the meta-analysis into real time. An effect size of -0.30 would mean that in an 8-min period, the average individual who is offered a tangible reward for doing the task (or doing well) will spend about 47 s less on the task (0.30 × 2.6 min = 0.78 min) when the reward is withdrawn than the average nonrewarded individual. Given this result, what would happen if a teacher implemented this incentive procedure in a reading program (for children who already enjoy reading) and then removed it? Ac-
cording to this estimate, students who are offered gold stars for reading would spend about 4 min less reading in a 40-min free-choice period (five times the 47 s estimated for an 8-min period) than students not given the incentive. If we assume that students without reward spend about 30 min reading in the 40-min free-choice period, then rewarded students would spend about 26 min reading (based on Deci et al.'s, 1999, analysis of engagement-contingent reward, rewarded children would spend about 25 min reading). A 4- to 5-min reduction in free-time reading could be behaviorally important if cumulated over many successive opportunities to read, but there are no studies that have addressed this issue.

A cautionary note is in order. Our example of reading and reward depends on the use of a standard deviation from a single well-designed study. It also depends on the ability to extrapolate from an 8-min experimental period to longer ones. It is possible that the negative effects, such as they are, are evident only for a short time at the beginning of the free-choice period. That is, it may well be the case that if an hour of free choice were given, results might look very different. The point is that this is a hypothetical example. Further evidence is required to generalize the findings to experiments with longer free-choice periods or to everyday settings in which choice is distributed over long periods of time.

Given the state of the literature, we conclude that the negative effect of tangible rewards offered for doing a high-interest task (or doing well) is statistically significant, but the size of the effect does not suggest a strong impact. Of course, our conclusion with regard to the magnitude of the negative effects of reward contingencies applies equally to positive effects. That is, although the positive effects are statistically sig-

from the results of Deci et al.'s (1999) meta-analysis. Deci et al. present a picture of pervasive negative effects. The picture depicted in our analysis is one of circumscribed negative effects. As noted, Deci et al. used reward contingencies that were theoretically relevant, but that were collapsed over distinct reward procedures. For example, on free-choice intrinsic motivation, Deci et al. showed a negative effect for performance-contingent rewards. The performance-contingent category included some studies of rewards offered for each unit solved, rewards offered for doing well, rewards offered for surpassing a score, and rewards offered for exceeding others. By combining these distinct procedures, Deci et al. obtained an overall negative effect for performance-contingent reward. We show that these diverse reward procedures produce different effects on free choice; hence, it is unwise to collapse them into a single category of performance-contingent reward. Similarly, Deci et al. collapsed over reward categories for the task-interest measure, and similar problems arise. In addition to collapsing over different reward categories, Deci et al. omitted several positive effects that, when included, resulted in positive findings for task interest. Overall, our meta-analysis indicates that rewards do not have pervasive negative effects when minor improvements to Deci et al.'s categorization of reward contingencies are made and all available studies are included.

Using cognitive evaluation theory to guide the classification of studies, Deci et al. (1999) obtained negative effects of tangible reward contingencies. We showed that by classifying studies according to the actual contingency used, different effects were obtained. That the results of a meta-analysis can be
nificant, they too are small. drastically altered by assigning studies
to categories based on a particular the-
A Comparison of Our Findings to oretical orientation points to some im-
Those of Deci et al. (1999) portant issues and limitations in this lit-
Our pattern of findings for expected erature. The difference between our
tangible reward contingencies differs findings and those of Deci et al. points
APPENDIX A
Study: Differences
Adorney (1983)a: Not included by Deci et al. The study assessed the effects of tangible rewards offered for surpassing a score.
R. Anderson, Manoogian, and Reznick (1976): For tangible reward, Deci et al. noted in Appendix A of their article that there was only one appropriate control group comparison to use (we used that group). However, with verbal reward, Deci et al. used the inappropriate control groups for their comparison.
S. Anderson and Rodin (1989): For verbal reward, Deci et al. did not include the free-choice measure (reported on p. 461 of the original article).
Bartelme (1983)a: Deci et al. did not include the free-choice measure in their analysis of performance-contingent rewards.
Boggiano and Barrett (1985): Not included by Deci et al. The study assessed the effects of verbal reward on intrinsic motivation.
Boggiano, Main, and Katz (1988): Not included by Deci et al. The study assessed the effects of verbal reward on intrinsic motivation.
Boggiano and Ruble (1979): Our free-choice effect size for tangible expected rewards offered for doing (-0.61) was calculated from means and SDs reported in the article; it is not clear how Deci et al. obtained their estimate (-0.94).
Brennan and Glover (1980): Our free-choice effect size for tangible rewards offered for doing (1.0) was calculated from the F value reported in the article; it is not clear how Deci et al. obtained their estimate (0.52).
Brewer (1980)a: Our free-choice effect size (-0.08) for rewards offered for doing well (Deci et al. label as performance contingent) was calculated from means and the mean square error reported in the dissertation; it is not clear how Deci et al. obtained their estimate (-0.20).
Calder and Staw (1975): For low-interest tasks on the self-report measure (0.61) and for tangible rewards offered for finishing task (-0.46), we calculated the effect sizes from the F value reported in the article; it is not clear how Deci et al. obtained their estimates (1.01 for low-interest task and -0.76 for expected tangible reward).
Carton and Nowicki (1998) Experiments 1 and 2: Recent studies not included by Deci et al. The studies assessed the effects of tangible rewards offered for each unit solved.
D. S. Cohen (1974)a: The study was a 2 × 2 × 2 design; 2 levels of task choice (choice, no choice), 2 levels of monetary reward (money, no money) and 2 levels of verbal praise (praise, no praise). For verbal reward, our effect size is based on the main effect of verbal praise; Deci et al. omitted conditions, and their effect size is based on verbal praise versus no praise for the no-money no-choice condition only.
Crino and White (1982): For verbal reward, Deci et al. reported a free-choice effect size; there was no free-choice measure. This study also included a low-interest task; the effect size for this was not included in Deci et al.'s analysis of low-interest tasks.
Dimitroff (1984)a: For tangible expected rewards for doing, our effect sizes are based on the whole sample; Deci et al.'s effect sizes are based on part of the sample. In addition, Dimitroff reports no difference between the rewarded and control groups (identical means) for the self-report measure; Deci et al. report -0.26.
Eisenberger (1999): Recent study not included by Deci et al.; included in Eisenberger, Pierce, and Cameron (1999). The study assessed the effects of tangible rewards offered for surpassing a score.
Eisenberger, Rhoades, and Cameron (1999): Recent study not included by Deci et al. The study assessed the effects of tangible rewards offered for surpassing a score and tangible rewards offered for exceeding others.
Eisenstein (1985): This study included a low-interest task; Deci et al. did not include the effect size for free-choice unexpected reward in their analysis of low-interest tasks.
Feehan and Enzle (1991) Experiment 1: Mislabeled by Deci et al. as Experiment 2.
Feehan and Enzle (1991) Experiment 2: Not included by Deci et al.
Freedman and Phillips (1985): Not included by Deci et al. who stated that the tasks used were uninteresting. Close inspection of the study, however, suggests that both a high- and low-interest task were used (see comments on Phillips & Freedman, 1985). Deci et al. did not include the study in their primary analysis of the effects of reward on high-interest tasks or in their supplemental analysis of reward effects on low-interest tasks.
G. S. Goldstein (1980)a: Deci et al. did not include the free-choice effect size.
L. W. Goldstein (1977)a: The study was a 2 × 2 design; 2 levels of tangible reward (reward, no reward) and 2 levels of verbal feedback (feedback, no feedback). For verbal reward, our effect size is based on the main effect of verbal feedback; Deci et al. compared feedback to no feedback in the no-tangible-reward condition only.
Griffith (1984)a: In this study, there were two rewarded groups. In one group, participation was individual, in the other, it was group participation. In their analysis of low-interest tasks and in their analysis of engagement-contingent reward, Deci et al. included only the effects of reward for the individual context (but their sample size indicates they may have actually used both). We used both contexts and calculated effect sizes from F values, means, and SDs.
Harackiewicz et al. (1984) Experiments 1 and 3: Our effect sizes for rewards offered for exceeding others are based on a comparison of rewarded groups to a control group given feedback and to a control group given feedback with performance objectives comparable to the reward group. Deci et al. omitted one of the control groups and based their effect size on a comparison to the feedback control only.
Hom (1987) Experiments 1 and 2: Excluded by Deci et al. who stated that there was too little information to include. We included Experiment 2 in the verbal reward category (there was sufficient information); Experiment 1 was included in the overall reward analysis and in the analysis of tangible rewards. There was no information about reward expectancy or contingency; we did not include the study in these analyses.
Kast and Connor (1988): The study was concerned with the effects of verbal feedback on intrinsic motivation. There were 180 participants in the rewarded group and 60 in the control group; Deci et al. report 90 and 30.
Kruglanski et al. (1975) Experiment 2: Deci et al. excluded an analysis of the effects of rewards on the self-report measure for one of the tasks (stock market game).
Lepper, Sagotsky, Dafoe, and Greene (1982) Experiment 3: Our free-choice effect size for rewards offered for doing (-0.13) was calculated from means and SDs; it is not clear how Deci et al. obtained their estimate (-0.50).
McLoyd (1979): Deci et al. did not include the self-report measure in their analysis of completion-contingent rewards.
Mynatt, Oakley, Piccione, Margolis, and Arkkelin (1978): Our free-choice effect size for expected tangible rewards offered for doing (0.19) is based on between-group differences; Deci et al. used a within-group comparison (-0.11).
Okano (1981) Experiment 2: For task-noncontingent reward, our free-choice effect size (-0.47) was calculated from means and SDs reported in the article; it is not clear how Deci et al. obtained their estimate (-0.84). On the self-report measure, Deci et al.'s effect size should be negative, not positive.
Overskeid and Svartdal (1996) Experiments 1 and 2: Excluded by Deci et al. because the task was of low initial interest. This study should have been included in Deci et al.'s analysis of reward effects on low-interest tasks.
Patrick (1985)a: There is not enough information in the study to calculate a free-choice or self-report effect size.
Phillips and Freedman (1985): Excluded by Deci et al. who stated that the tasks used were uninteresting. Two tasks were used in this study; one was rated above the median on a 7-point scale (3.8), the other was rated below the median (2.7) (see p. 307 of the original article). Freedman and Phillips (1985) was also in accord with this scale. This study was not included in either Deci et al.'s primary analysis of the effects of reward on high-interest tasks or their supplemental analysis of reward effects on low-interest tasks.
Picek (1976): In this study, there were three reward conditions; one group received a reward for doing the task (we included this in our analysis of rewards offered for doing). In the other two reward groups (one was labeled performance contingent by Deci et al.), only half the participants in the rewarded conditions were actually offered and given a reward. Because not all participants were offered a reward, we excluded these two conditions from our analysis.
Pittman, Cooper, and Smith (1977): Our free-choice effect size (-0.50) for rewards offered for surpassing a score (Deci et al. label the study performance contingent) was calculated from the p value reported in the article; it is not clear how Deci et al. obtained their estimate (-1.46).
Pittman, Emery, and Boggiano (1982) Experiment 1: Deci et al. did not include the self-report measure for engagement-contingent rewards; a self-report measure is reported in the article on page 792.
Pretty and Seligman (1984) Experiment 1: The study was a 3 × 3 design; 3 levels of tangible reward (expected, unexpected, no reward) and 3 levels of feedback (positive, negative, no feedback). For verbal reward, we compared positive feedback to no feedback across all reward conditions; Deci et al. compared positive feedback to no feedback in the no-reward condition only.
Ross (1975) Experiment 1: For rewards offered for doing, our free-choice effect size (0.01) was calculated from t values reported in the article; it is not clear how Deci et al. obtained their estimate (-0.18).
Shapira (1976): Excluded by Deci et al. who stated that rewarded participants worked on an easier task than the control group. Both groups worked on Soma puzzles. Because task difficulty was not a variable assessed in our meta-analysis, we included this study.
A. T. Smith (1980)a: Deci et al. did not include the free-choice effect size for verbal reward or the effect size for low-interest tasks in their analysis of low-interest tasks.
T. W. Smith and Pittman (1978): Our free-choice effect size (-0.56) for rewards offered for surpassing a score (Deci et al. label the study performance contingent) was calculated from the p value reported in the article; it is not clear how Deci et al. obtained their estimate (-94).
W. E. Smith (1975)a: For verbal reward, Deci et al. used only part of the sample. There were two verbal reward groups (n = 40) and a control condition (n = 20); Deci et al. report 20, 20. For unexpected reward, Deci et al. also used only part of the sample.
Thompson, Chaiken, and Hazlewood (1993): For engagement-contingent rewards (or rewards offered for doing a task), Deci et al. report the self-report effect size as the free-choice effect size and did not give an effect size for the self-report measure.
Tripathi (1991): Not included by Deci et al. The study assessed the effects of tangible rewards offered for doing a task and for surpassing a score.
Tripathi and Agarwal (1988): For rewards offered for doing, our free-choice effect size (0.34) was calculated from F values reported in the article; it is not clear how Deci et al. obtained their estimate (0.00).
Weiner (1980): Our free-choice effect size (0.35) for rewards offered per unit solved (Deci et al. labeled the study completion contingent) was calculated from means and SDs reported in the article; it is not clear how Deci et al. obtained their estimate (0.20).
Wicker et al. (1990): Excluded by Deci et al. who reported that the article was not about the effects of rewards on intrinsic motivation and none of the appropriate statistics were available. We wrote to the first author and obtained the data for the free-choice and self-report measures.
Williams (1980): For rewards offered for doing, our free-choice effect size is based on the whole sample. Although Deci et al. report the entire sample, their effect size appears to be based on only part of the sample.
Wilson (1978)a: In this study, there were two reward groups (one received $2.50; the other group received $0.50); Deci et al. included only one of the reward groups in their analysis of low-interest tasks and in their analysis of engagement-contingent rewards. We included both. For other studies that used two different magnitudes of rewards, Deci et al. included both (e.g., McLoyd, 1979).
a Unpublished doctoral dissertation.
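Several entries in Appendix A state that an effect size was calculated from means and SDs, from a t value, or from an F value. The formulas are not given in this part of the article, so the following is only a minimal sketch of the standard textbook conversions for a two-group comparison; it is an assumption that these match the computations used here, and the function names and the numbers in the usage lines are hypothetical.

```python
import math


def g_from_means(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Standardized mean difference using the pooled within-group SD."""
    pooled_var = ((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / (n_e + n_c - 2)
    return (mean_e - mean_c) / math.sqrt(pooled_var)


def g_from_t(t, n_e, n_c):
    """Standardized mean difference recovered from an independent-groups t value."""
    return t * math.sqrt(1.0 / n_e + 1.0 / n_c)


def g_from_f(f, n_e, n_c, sign=1):
    """Same conversion for a two-group F value with 1 numerator df (F = t**2).

    F carries no direction, so the caller supplies the sign from the group means.
    """
    return sign * g_from_t(math.sqrt(f), n_e, n_c)


def small_sample_correction(g, n_e, n_c):
    """Hedges' approximate bias correction: g * (1 - 3 / (4 * df - 1))."""
    df = n_e + n_c - 2
    return g * (1.0 - 3.0 / (4.0 * df - 1.0))


# Hypothetical numbers, not taken from any study in the appendix.
example_g = g_from_means(mean_e=10.0, mean_c=12.0, sd_e=4.0, sd_c=4.0, n_e=20, n_c=20)
example_t = example_g / math.sqrt(1.0 / 20 + 1.0 / 20)
print(example_g, g_from_t(example_t, 20, 20), g_from_f(example_t ** 2, 20, 20, sign=-1))
```

Because the three routes are algebraically equivalent for a two-group design, disagreements of the kind listed in the table would have to come from different inputs (which groups, which part of the sample, which statistic), not from the conversion itself.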
APPENDIX B
A comparison of our classification of reward contingencies with Deci, Koestner, and Ryan (1999)
Our classification of the reward contingency: Differences between our categories and Deci et al.'s (1999)
Task-noncontingent reward: Also labeled task noncontingent by Deci et al. The same set of studies was used in both analyses.
Rewards offered for doing well: All studies included in our analysis of this category were labeled performance contingent by Deci et al.
Rewards offered for doing the task: Studies included in our analysis of this category were labeled engagement contingent by Deci et al. We included L. W. Goldstein (1977) in this category. In this study participants were offered a reward to take pictures; no instructions were given about completing the task and there was no requirement to do well or to achieve any specific standard. Deci et al. labeled this study as completion contingent.
Rewards offered for finishing or completing a task: Studies in our analysis of this category were labeled completion contingent by Deci et al. We included Tripathi and Agarwal (1985) in this category. In this study rewarded participants were told they could earn a reward if they completed the task even if all solutions were not correct. Deci et al. labeled this study as engagement contingent.
Rewards offered for each puzzle or problem solved: Most of the studies in our analysis of this category were labeled completion contingent by Deci et al. Five studies in this category were labeled performance contingent by Deci et al. (Bartelme, 1983; D. S. Cohen, 1974; Effron, 1976; Lee, 1982; Weiner & Mander, 1978). In each of these studies, participants were offered pay for each point earned on a puzzle-solving task, each word found in a word-search task, each code completed on a decoding task, or each correct answer on a matching-to-sample task.
Rewards offered for surpassing a score: All studies in our analysis of this category were labeled performance contingent by Deci et al.
Rewards offered for exceeding a norm: All studies in our analysis of this category were labeled performance contingent by Deci et al.
APPENDIX C
Studies included in the analysis of the effects of rewards on intrinsic
motivation for tasks with low initial interest
Study, Reward type, Reward expectancy, Reward contingency, NE, NC, Free-choice effect size (g), Self-report effect size (g)
Calder and Staw (1975) T E For finishing task 10 10 0.61
Chung (1995) T E For doing task 5 5 1.93
Chung (1995) T E Insufficient information 5 5 1.22
Crino and White (1982) V U Per unit solved 10 5 -0.05
Crino and White (1982) V U Yoked per unit 10 5 0.32
Daniel and Esser (1980) T E For doing quickly 16 16 -0.28 0.08
Eisenstein (1985) T U For finishing task 6 6 0.62
Eisenstein (1985) T E For finishing 16 6 0.22
Freedman and Phillips (1985) T E Per unit solved 24 25 0.24
Freedman and Phillips (1985) T E For finishing task 26 25 0.53
Griffith (1984)a T E For doing task 44 44 0.25
Hamner and Foster (1975) T E For doing task 16 15 -0.28
Hamner and Foster (1975) T E Per unit solved 19 15 0.52
Hitt, Marriott, and Esser (1992) T E For doing task 30 15 0.57 -0.16
Loveland and Olley (1979) T E For doing task 6 6 1.20
McLoyd (1979) T E For finishing task 18 9 0.61 0.00
Mynatt et al. (1978) T E For doing task 5 5 1.35
Newman and Layton (1984) T E For doing task 20 10 0.41
Overskeid and Svartdal (1996) Experiment 1 T E For doing task 10 10 -0.29 0.39
Overskeid and Svartdal (1996) Experiment 2 T E For doing task 64 32 -0.15
Phillips and Freedman (1985) T E For finishing task 12 12 0.63
Phillips and Freedman (1985) T E Per unit solved 12 12 -0.10
A. T. Smith (1980)a T E For doing task 21 27 0.04
A. T. Smith (1980)a V U For doing task 22 26 0.17
Wilson (1978)a T E For doing task 46 23 -0.03 0.12
Note. T = tangible reward, V = verbal reward, E = expected, U = unexpected, NE = sample size of experimental group, NC = sample size of control group.
a Unpublished doctoral dissertation.
APPENDIX D
Studies included in the analysis of the effects of verbal rewards on intrinsic
motivation for tasks with high initial interest
Study, Reward expectancy, Reward contingency, NE, NC, Free-choice effect size (g), Self-report effect size (g)
R. Anderson et al. (1976) U For doing task 18 19 0.40
S. Anderson and Rodin (1989) U For doing task 10 10 0.20 0.40
Blanck, Reis, and Jackson (1984) Experiment 1 U For doing task 70 69 0.56 0.69
Blanck et al. (1984) Experiment 2 U For doing task 12 12 0.73 0.00
Boggiano and Barrett (1985) U For doing task 18 18 0.35
Boggiano et al. (1988) U For doing task 66 34 0.42
Butler (1987) E For doing task 50 50 1.59b
D. S. Cohen (1974)a U For doing task 52 52 0.07 0.42
Crino and White (1982) U Per unit solved 10 5 0.05
Crino and White (1982) U Yoked per unit 10 5 -0.79
Danner and Lonkey (1981) U For doing task 30 30 -0.10 -0.08
Deci (1971) Experiment 3 U For doing task 12 12 0.82 0.00
Deci (1972b) U For doing task 48 48 0.29
Deci, Cascio, and Krusell (1975) No information No information 32 32 0.02
Dollinger and Thelen (1978) E For doing well 12 12 -0.07 0.00
Effron (1976)a U For doing task 15 13 0.89
L. W. Goldstein (1977)a U For doing task 32 32 0.77 0.12
Harackiewicz (1979) U For doing task 31 31 0.59
Hom (1987) Experiment 2 No information No information 28 28 -0.37
Kast and Connor (1988) U For doing task 180 60 -0.46b
Koestner, Zuckerman, and Koestner (1987) U For doing task 35 18 0.51 0.00
Orlick and Mosher (1978) U For doing task 11 12 -0.34
Pallak, Costomiris, Sroka, and Pittman (1982) U For doing task 16 12 -0.47
Pallak, Costomiris, Sroka, and Pittman (1982) E For doing task 14 12 0.32
Pittman, Davey, Alafat, Wetherill, and Kramer (1980) U For doing task 24 12 0.80
Pretty and Seligman (1984) Experiment 1 U For doing task 30 30 0.35 0.46
Ryan, Mims, and Koestner (1983) E For doing task 32 16 0.53 0.00
Sansone (1986) U For doing task 44 11 0.68
Sansone (1989) E For doing task 82 41 0.46
Sansone, Sachau, and Weir (1989) U For doing task 40 40 0.12
Shanab, Peterson, Dargahi, and Deroian (1981) U For doing task 20 20 0.64 0.43
A. T. Smith (1980)a U For doing task 21 27 0.24
W. E. Smith (1975)a U For learning 20 20 0.04 0.00
Tripathi and Agarwal (1985) E For doing task 20 20 1.61 0.48
Vallerand (1983) E For doing task 40 10 1.98b
Vallerand and Reid (1984) E For doing task 28 28 0.53
Zinser, Young, and King (1982) U For doing task 64 32 0.08
Note. U = unexpected, E = expected, NE = sample size of experimental group, NC = sample size
of control group.
a Unpublished doctoral dissertation.
b Outliers in the data set.
APPENDIX E
Studies included in the analysis of unexpected tangible rewards on intrinsic
motivation for tasks with high initial interest for all reward contingencies
Study, Reward contingency, NE, NC, Free-choice effect size (g), Self-report effect size (g)
Eisenstein (1985) For finishing 10 10 0.46
Greene and Lepper (1974) For doing well 26 15 0.14
Harackiewicz et al. (1984) Experiment 2 Exceeding others 15 15 0.44b 0.15
Kruglanski, Alon, and Lewis (1972) For winning 36 33 -0.65
Lepper et al. (1973) For doing task 18 15 0.12
Orlick and Mosher (1978) For doing well 12 12 -1.28
Pallak et al. (1982) For doing task 15 12 -0.43
Pretty and Seligman (1984) Experiment 1 For doing task 30 30 0.06 0.42
Pretty and Seligman (1984) Experiment 2 For doing task 30 30 0.06 0.38
W. E. Smith (1975)a For learning 40 40 0.06 0.00
Note. NE = sample size of experimental group, NC = sample size of control group.
a Unpublished doctoral dissertation.
b Effect sizes based on performance measures on the task during the free-choice period (e.g., number of balls played in a pinball game, number of trials initiated in a labyrinth game, number of words found in a word search game). See text for details.
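To illustrate how per-study values such as those in the appendix tables can be combined, the sketch below computes an inverse-variance weighted mean of standardized mean differences using the usual large-sample variance approximation. This is an assumption about a common meta-analytic procedure, not a statement of the exact method used in this article; the function names are hypothetical. The rows are the four Appendix E studies that report two effect sizes, with the first value read as the free-choice effect size (an assumption based on the column order), so the result is not one of the article's reported estimates.

```python
def effect_variance(g, n_e, n_c):
    """Large-sample variance approximation for a standardized mean difference."""
    return (n_e + n_c) / (n_e * n_c) + g ** 2 / (2.0 * (n_e + n_c))


def weighted_mean_effect(rows):
    """Inverse-variance weighted mean of (n_e, n_c, g) rows."""
    weights = [1.0 / effect_variance(g, n_e, n_c) for n_e, n_c, g in rows]
    weighted_sum = sum(w * g for w, (_, _, g) in zip(weights, rows))
    return weighted_sum / sum(weights)


# (NE, NC, first listed g) for the Appendix E rows that show both measures.
# Illustrative subset only; the article's analyses use the full study sets.
appendix_e_rows = [
    (15, 15, 0.44),  # Harackiewicz et al. (1984) Experiment 2
    (30, 30, 0.06),  # Pretty and Seligman (1984) Experiment 1
    (30, 30, 0.06),  # Pretty and Seligman (1984) Experiment 2
    (40, 40, 0.06),  # W. E. Smith (1975)
]

print(round(weighted_mean_effect(appendix_e_rows), 2))
```

Larger studies receive more weight under this scheme, so a single small study with a large effect moves the combined value only slightly.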
APPENDIX F
Studies included in the analysis of expected tangible rewards on intrinsic
motivation for tasks with high initial interest, listed by reward contingency
Study, NE, NC, Free-choice effect size (g), Self-report effect size (g), Reward delivery
Task noncontingent
Dafoe (1985)a 25 28 -0.20 0.73
Deci (1972a) 24 16 0.08
Earn (1982) 40 20 -0.28 0.18
Kruglanski, Friedman, and Zeevi (1971) 16 16 -0.69
Okano (1981) Experiment 2 11 11 -0.47 -0.27
Pittman et al. (1982) Experiment 1 10 10 0.26 0.00
Ross, Karniol, and Rothstein (1976) 12 12 0.44
Swann and Pittman (1977) Experiment 1 20 20 -0.21
Wimperis and Farr (1979) 16 16 0.56
Rewards offered for doing task
Amabile, Hennessey, and Grossman (1986) Experiment 1 56 57 0.00 0.00
Amabile et al. (1986) Experiment 3 30 30 0.00
R. Anderson et al. (1976) 36 19 -0.53
Arnold (1976) 17 36 0.00
Arnold (1985) 13 16 -0.04
Boggiano and Ruble (1979) 20 20 -0.61
Boggiano, Harackiewicz, Bessette, and Main (1985) 26 13 -0.79
Boggiano, Ruble, and Pittman (1982) 81 84 0.28
Brennan and Glover (1980) 20 19 1.00b
Brewer (1980)a 24 24 -0.13 0.12
Chung (1995) 5 5 -1.61b
Danner and Lonkey (1981) 30 30 -1.33b -1.23
DeLoach, Griffith, and LaBarba (1983) 26 26 0.00
Dimitroff (1984)a 108 36 -0.27 0.00
Effron (1976)a 12 13 0.19
Fabes, McCullers, and Hom (1986) 24 24 0.06 -0.14
Fabes et al. (1988) 14 14 -1.34b -0.76
Fabes, Fultz, Eisenberg, May-Plumlee, and Christopher (1989) 15 14 -0.73
Feehan and Enzle (1991) Experiment 1 24 12 -0.97
L. W. Goldstein (1977)a 16 16 -0.99 -0.87
Greene and Lepper (1974) 15 15 -0.70
Griffith (1984)a 44 44 -0.23
Hamner and Foster (1975) 15 15 -0.14
Harackiewicz (1979) 31 31 -0.38
Hitt et al. (1992) 30 15 -0.82 -0.47
Hyman (1985)a 32 32 -0.42
Karniol and Ross (1977) 17 20 -0.08
Lepper et al. (1973) 18 15 -0.72
Lepper et al. (1982) 32 32 -0.13
Loveland and Olley (1979) 6 6 -1.20
Morgan (1981) Experiment 1 27 27 -0.98 -0.31
Morgan (1981) Experiment 2 20 20 -0.77 0.04
Morgan (1983) Experiment 1 40 40 -1.94b
Morgan (1983) Experiment 1 40 20 -0.54
Morgan (1983) Experiment 2 20 20 -0.66 0.00
Mynatt et al. (1978) 5 5 0.19
Newman and Layton (1984) 20 10 -0.37
Ogilvie and Prior (1982) 26 26 -0.08
Okano (1981) Experiment 1 15 15 -0.99 -0.45
Okano (1981) Experiment 2 10 11 -1.31b 0.00
Patrick (1985)a 33 31 0.00 0.00
Perry, Bussey, and Redman (1977) 32 32 -0.43 -0.21
Picek (1976)a 10 10 0.00 -0.65
Pittman et al. (1982) Experiment 1 10 10 0.17 0.00
Pittman et al. (1982) Experiment 2 27 27 -0.05
Pretty and Seligman (1984) Experiment 1 30 30 -0.75 -0.05
Pretty and Seligman (1984) Experiment 2 30 30 -0.13 -0.16
Reiss and Sushinsky (1975) 16 16 -0.83
Ross (1975) Experiment 1 40 20 0.01 -0.45
Ross (1975) Experiment 2 52 14 -0.66 0.00
Ross et al. (1976) 12 12 -0.64
Ryan et al. (1983) 16 16 -0.35 0.00
Sarafino (1984) 85 15 -0.41 0.00
Shiffman-Kauffman (1990)a 20 20 0.06 -0.04
A. T. Smith (1980)a 21 27 -0.82
Swann and Pittman (1977) Experiment 1 20 20 -0.78
Swann and Pittman (1977) Experiment 2 26 13 -1.01
Thompson et al. (1993) 34 33 -0.003 0.14
Tripathi (1991) 20 5 0.00 0.00
Tripathi and Agarwal (1988) 20 10 0.34b 0.72
Weiner and Mander (1978) 30 30 -0.34 0.00
Williams (1980) 24 24 0.18 0.00
Wilson (1978)a 46 23 -0.06 -0.01
Yuen (1984)a 60 60 -0.40 -0.12
Rewards offered for "doing well" or "doing a good job" on the task
Brewer (1980)a 48 24 -0.08 0.12 M
Dafoe (1985)a 26 28 0.00 0.59 M
Dollinger and Thelen (1978) 36 12 -0.55 0.00 L
Enzle, Roggeveen, and Look (1991) 40 10 -0.53 M
Fabes (1987) Experiment 1 18 19 -0.87 M
L. W. Goldstein (1977)a 16 32 -0.08 -0.48 M
Greene and Lepper (1974) 15 15 -0.57 M
Hyman (1985)a 16 16 0.11 M
Orlick and Mosher (1978) 14 12 -0.53 M
Pallak et al. (1982) 15 12 -0.17 M
Ryan et al. (1983) 32 32 -0.46 0.00 M
Taub and Dollinger (1975) 124 124 0.00 NI
Rewards offered for completing a task
Calder and Staw (1975) 10 10 -0.46 M
Eisenstein (1985) 18 10 -0.53 M
Fabes (1987) Experiment 1 19 19 -0.82 M
Fabes (1987) Experiment 2 14 14 -0.45 M
Freedman and Phillips (1985) 26 22 0.94 M
Griffith, DeLoach, and LaBarba (1984) 64 32 0.00 M
McLoyd (1979) 18 9 -1.04 0.00 M
Phillips and Freedman (1985) 12 12 0.74 M
Staw, Calder, Hess, and Sandelands (1980) 47 46 0.19 M
Tripathi and Agarwal (1985) 20 20 0.41 0.54 M
Rewards offered for each problem, puzzle, or unit solved
Arkes (1979) 32 32 -0.16 0.03 M
Arnold (1985) 13 16 -0.05 L
Bartelme (1983)a 35 34 0.04c 0.03 M
Boggiano et al. (1985) 26 13 -0.10 M
Brockner and Vasta (1981) 26 26 -0.37 -0.58 L
Carton and Nowicki (1998) Experiment 1 44 22 0.36c L
Carton and Nowicki (1998) Experiment 2 40 20 0.20c 0.71 L
D. S. Cohen (1974)a 52 52 -0.18 0.13 L
Deci (1971) Experiment 1 12 12 -0.54 0.00 L
Deci (1972b) 64 32 0.33 L
Effron (1976)a 43 28 -0.04 L
Feehan and Enzle (1991) Experiment 2 30 15 0.31c M
Freedman and Phillips (1985) 23 22 1.12 L
G. S. Goldstein (1980)a 14 14 -0.32 0.68 L
Hamner and Foster (1975) 18 15 -0.21 L
Kruglanski et al. (1975) Experiment 1 24 24 1.15b M
Lee (1982)a 40 40 -0.36c 0.35 M
Liberty (1986) Experiment 1a 23 23 -0.86c -0.34 L
Liberty (1986) Experiment 2a 44 42 -0.22c 0.04 L
McGraw and McCullers (1979) 20 20 -0.04 NI
Phillips and Freedman (1985) 12 12 0.77 L
Porac and Meindl (1982) 20 20 -0.78 L
Shapira (1976) 30 30 0.41 L
Sorensen and Maehr (1976) 20 20 -0.54 L
Vasta and Stirpe (1979) 4 5 -0.16 L
Weiner (1980) 24 24 0.35 0.00 M
Weiner and Mander (1978) 30 30 -0.54 0.00 L
Wicker et al. (1990) 29 29 -0.46 0.18 L
Wimperis and Farr (1979) 16 16 1.36b NI
Rewards offered for meeting a specific standard or surpassing a score
Adorney (1983)a 35 36 0.39 0.48 L
Bartelme (1983)a 35 34 0.19c -0.03 M
Boggiano and Ruble (1979) 20 20 -0.17 M
Dafoe (1985)a 28 28 0.15 0.59 M
Eisenberger (1999) 214 316 0.08 0.31 M
Eisenberger, Rhoades, and Cameron (1999) Experiment 1 110 113 0.10 0.34 M
Harackiewicz, Abrahams, and Wageman (1987) 13 25 -0.28 M
Hyman (1985)a 16 16 0.04 M
Kruglanski et al. (1975) Experiment 2 40 40 0.38 M
Patrick (1985)a 30 31 0.00 0.00 M
Pittman et al. (1977) 60 20 -0.50c -0.20 L
T. W. Smith and Pittman (1978) 66 33 -0.56c 0.00 L
Tripathi (1991) 20 5 0.00 0.00 M
Rewards offered for meeting or exceeding others
Dafoe (1985)a 25 28 0.00 0.59 M
Eisenberger, Rhoades, and Cameron (1999) Experiment 1 106 106 0.38 0.22 M
Harackiewicz (1979) 31 31 -0.87 M
Harackiewicz and Manderlink (1984) 47 47 0.33 M
Harackiewicz et al. (1984) Experiment 1 32 64 0.27 0.12 M
Harackiewicz et al. (1984) Experiment 2 15 15 _0.43c -0.18 M
Harackiewicz et al. (1984) Experiment 3 26 52 0.34c 0.40 M
Harackiewicz et al. (1987) 11 29 0.12 M
Karniol and Ross (1977) 20 20 0.15 L/M
Luyten and Lens (1981) 10 10 -0.90 0.08 L
Rosenfield, Folger, and Adelman (1980) 30 27 0.30 0.22 L/M
Salancik (1975) 38 39 -0.34 0.01 M
Shiffman-Kauffman (1990)a 20 20 0.35 0.00 M
Tripathi and Agarwal (1988) 20 10 0.87 1.01 M
Weinberg and Jackson (1979) 40 40 0.00 L
Note. NE = sample size of experimental group, NC = sample size of control group, M = maximum reward, L = less than maximum reward, NI = not enough information.
a Unpublished doctoral dissertation.
b Outliers in the data set.
c Effect sizes based on performance measures during the free-choice period. See text for details.
APPENDIX G
Studies or conditions within studies included in the overall analyses of
reward and tangible reward that could not be classified into
reward contingencies
Study, Reward contingency, NE, NC, Free-choice effect size (g), Self-report effect size (g)
Chung (1995) Insufficient information 5 5 -1.02
Daniel and Esser (1980) For doing quickly 16 16 -0.75 -0.71
Hom (1987) Experiment 1 No information 26 26 0.12 0.00
Hom (1987) Experiment 2 No information 28 28 -0.37b
W. E. Smith (1975)a For showing learning 40 40 -0.22 0.00
Note. NE = sample size of experimental group, NC = sample size of control group.
a Unpublished doctoral dissertation.
b Effect size based on performance measures during the free-choice period. See text for details.