The False Promise of Class-Size Reduction
Matthew M. Chingos April 2011
Contents
Introduction and summary
Looking ahead
References
Endnotes
Policymakers across the nation, including those in at least 24 states, have taken
these ideas to heart and enacted CSR initiatives at a cost of billions of dollars.2
California allocated $1.5 billion per year in the late 1990s to reduce class size
in the early grades. Florida has spent about $20 billion since 2002 reducing class
size in every grade from kindergarten through high school.3 The federal govern-
ment also has its own program, which provided $1.2 billion to $1.6 billion per
year from 1999 to 2001 for CSR in grades K–3. This program was absorbed into
Title II of the No Child Left Behind Act in 2001.4
These policies, coupled with trends in local school districts, have produced a wide-
spread reduction in the number of students per teacher over the past four decades.5
Figure 1 shows that the pupil-teacher ratio in public schools has fallen by about 30
percent since 1970. This trend partly reflects an increase in educational services to
students with disabilities, as required by federal law beginning in 1975.6 But falling
pupil-teacher ratios affected all students, as evidenced by the even steeper drop at
private schools (which serve fewer disabled students).7 The trend at private schools
also likely reflects the strong preference of parents for small classes and the greater
incentive for private schools to respond to those preferences.
The evidence on class size indicates that smaller classes can, in some circum-
stances, improve student achievement if implemented in a focused way. But CSR
policies generally take exactly the opposite approach by pursuing across-the-
board reductions in class size at the state or federal level. These large-scale, untar-
geted policies are also extremely expensive and represent wasted opportunities to
make smarter educational investments.
Large-scale CSR policies clearly fail any cost-benefit test because they entail steep
costs and produce benefits that are modest at best. But what about reductions in
class size at the district or school level? When school finances are limited (as they
always are), the cost-benefit test any educational policy must pass is not “Does
this policy have any positive effect?” but rather “Is this policy the most produc-
tive use of these educational dollars?” Assuming even the largest class-size effects
in the research literature, such as the STAR results that indicate that a 32 per-
cent reduction in class size increased achievement by about 15 percent of a year
of learning after one year, CSR will still fail this test because it is so expensive.
Reducing class size by one-third, from 24 to 16 students, requires hiring 50 per-
cent more teachers. Depending on how much extra space schools have, new facili-
ties may need to be built to accommodate the additional classes.
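
The staffing arithmetic behind the 50 percent figure can be sketched in a few lines of Python. This is only an illustration, assuming fixed enrollment and exactly one teacher per class; the function name and the enrollment of 480 students are hypothetical and chosen so the numbers divide evenly.

```python
from fractions import Fraction


def extra_teacher_share(old_size: int, new_size: int) -> Fraction:
    """Fractional increase in teachers needed when class size falls from
    old_size to new_size, assuming fixed enrollment and one teacher per
    class (an illustrative simplification)."""
    # Teachers needed = enrollment / class size, so staffing scales by
    # old_size / new_size regardless of the enrollment level.
    return Fraction(old_size, new_size) - 1


# Hypothetical enrollment, used only to make the example concrete.
enrollment = 480
teachers_before = enrollment // 24  # classes of 24 students
teachers_after = enrollment // 16   # classes of 16 students

print(teachers_before, teachers_after)   # 20 30
print(extra_teacher_share(24, 16))       # 1/2, i.e. 50 percent more teachers
```

Because the increase depends only on the ratio of the two class sizes, the same 50 percent applies to a school of any size under these assumptions.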
There are certainly many policies that might be proposed as cost-effective alterna-
tives to CSR, but one set that stands out is the set of policies aimed at improving
teacher quality. Researchers agree that teacher quality is the single most important
in-school determinant of how much students learn.9 Stanford economist Eric
The fact that across-the-board CSR policies at the state or district level are not
cost-effective does not mean that smaller classes should never be used, but rather
that they should be reserved for use in special cases by individual schools. A
principal may decide, for example, that a smaller class makes sense for an inexperi-
enced teacher who needs support in developing skills to provide accommodations
for students with disabilities. At the same time, the principal may want to assign a
larger class to a highly effective veteran teacher, perhaps with some extra compen-
sation for the additional work required. School districts should encourage this
kind of creative management and enable it by collecting and providing to princi-
pals detailed data on their teachers and the classes they teach.
The vast majority of class-size studies are not rigorous, so their results are not
very useful as a guide to policy. The primary difficulty in studying class size is that
schools with different class sizes likely differ in many other, difficult-to-observe
ways. More affluent schools are more likely to have the resources needed to pro-
vide smaller classes, which would create the illusion that smaller classes are better
when in fact family characteristics were the real reason. Alternatively, a school
that serves many students with behavior problems may find it easier to manage
these students in smaller classes. A comparison of such schools to other schools
might give the appearance that small classes produce less learning when in fact the
behavior problems were the main factor.
Studies that do not carefully isolate the causal effect of class size (and only class
size) produce widely varying results. Stanford’s Eric Hanushek compiled 276
estimates of class-size effects from 59 studies, and found that only 11 percent of
these estimates indicated positive effects of smaller classes.12 A similar number
(9 percent) were negative, with the remaining 80 percent not statistically distin-
guishable from zero. Princeton economist Alan Krueger argued for an alternative
method of counting the estimates, but this change only increased the proportion
of studies showing positive effects to 26 percent, with the majority showing either
negative or insignificant effects.13 One way to interpret these tallies is that class
size matters in some circumstances but not others. Another plausible explanation
is that unreliable studies produce unreliable results.
The only way to credibly measure the causal effect of class size is to compare
students who are in larger or smaller classes for reasons unrelated to their
achievement. One way to do this is a controlled experiment, where research-
ers randomly assign students to smaller and larger classes. The only large-scale
class-size experiment carried out in the United States is the Student Teacher
Achievement Ratio, or STAR, study, which was conducted in Tennessee during
the late 1980s.14 Beginning with the entering kindergarteners in 1985, students
and teachers were randomly assigned to a “small” class, with an average of 15
The results of the STAR experiment after one year were encouraging: Students in
the small kindergarten classes outperformed students in the regular-size classes
on standardized tests by about 15 percent of a year of learning.16 But this effect did
not increase after one year, and decreased by the end of the third grade when the
experiment ended.17 In other words, the calculation of the class-size effect depends
enormously on the time frame used. The bump in test scores after one year would be
impressive if it didn’t erode over time despite the continued use of small classes.18
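
The conversion from test-score standard deviations to "percent of a year of learning" used here is a simple division, sketched below in Python. The function names are hypothetical. The annual-gain benchmarks themselves come from Hill and others (2008) and are not reproduced in this report, so the sketch back-computes rough implied values from the rounded figures given in the endnotes (0.2 standard deviations for about 15 percent of a year after one year, and 0.14 standard deviations for about 16 percent after three years). Treat those implied values as approximations, not as the published benchmarks.

```python
def share_of_year(effect_sd: float, annual_gain_sd: float) -> float:
    """Convert an effect size (in standard deviations) into a share of a
    year of learning by dividing by the typical annual gain (in SD)."""
    return effect_sd / annual_gain_sd


def implied_annual_gain(effect_sd: float, reported_share: float) -> float:
    """Back out the annual gain implied by a reported effect size and a
    reported share of a year. Inputs are rounded, so this is approximate."""
    return effect_sd / reported_share


# Figures taken from the report's endnotes; the results are assumptions
# derived from rounded numbers, not the Hill and others (2008) values.
gain_after_one_year = implied_annual_gain(0.20, 0.15)     # roughly 1.33 SD
gain_after_three_years = implied_annual_gain(0.14, 0.16)  # roughly 0.88 SD

print(round(share_of_year(0.20, gain_after_one_year), 2))
print(round(share_of_year(0.14, gain_after_three_years), 2))
```

The sketch also shows why an effect can shrink in standard deviations without shrinking in years of learning: typical annual gains are smaller in later grades, so a smaller effect size is divided by a smaller benchmark.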
Students who entered the STAR experiment in later years also saw gains from
being in the smaller class that were generally concentrated in the first year of
participation. Additionally, the positive effects of class size in Project STAR were
largest for black students, economically disadvantaged students, and boys.19
The other prominent high-quality study of class size in the United States is
Stanford economist Caroline Hoxby’s examination of Connecticut schools using
data from the 1980s and 1990s.20 Hoxby takes advantage of differences in class
size that result from random changes in the school-age population. For example,
a small school that has 15 first-grade students in one year and 18 the next year
would have a larger class during the second year. Additionally, a school that has
set a class-size limit of 25 would have one second-grade class of 25 if there were
25 second-grade students but two classes of 13 if there were 26 students. Hoxby
finds no relationship between class size and achievement in fourth and sixth grade
(which should reflect class size in all previous grades). Hoxby does not even find
class-size effects at schools that serve disproportionately large shares of disadvan-
taged or minority students.
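
The class-size cap in the example above can be sketched as a short Python function. This is a minimal illustration, assuming a school opens the fewest classes that satisfy the cap and splits students as evenly as possible; the function name is hypothetical, and the sketch is not Hoxby's estimation method, only the rule that generates the variation she exploits.

```python
def class_sizes(enrollment: int, cap: int) -> list[int]:
    """Split a grade's enrollment into the fewest classes that respect the
    cap, with students spread as evenly as possible (an assumption)."""
    if enrollment <= 0:
        return []
    n_classes = -(-enrollment // cap)  # integer ceiling division
    base, extra = divmod(enrollment, n_classes)
    # 'extra' classes get one additional student so the sizes sum correctly.
    return [base + 1] * extra + [base] * (n_classes - extra)


print(class_sizes(25, 25))  # [25]
print(class_sizes(26, 25))  # [13, 13]
```

A single additional student pushes average class size from 25 to 13, a change driven by enrollment rather than by anything about the students' achievement.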
Even though these studies cannot be reconciled, the important point is that we just
don’t know a whole lot about the impact of CSR on student achievement. The two
studies of the early grades have conflicting results, and we know little about the later
grades. The only high-quality study of class size in middle or high school is Thomas
Dee and Martin West’s analysis of eighth-graders in a nationally representative data
set.25 Dee and West compare the outcomes of the same students who attended
different-size classes in different subjects. They find no impact of class size on test
scores (except for a small effect in urban schools), but do find modest effects on
noncognitive skills such as student attentiveness and attitudes about learning.
Research on large-scale CSR initiatives suggests that such policies are unlikely to
produce the kind of results seen in the STAR experiment. California’s late-1990s
policy that reduced K–3 class sizes by about 10 students (from 30 to 20, on aver-
age) is difficult to evaluate because statewide tests were not administered until
after the program began. The most careful study of this multibillion-dollar policy
found that it had modest effects on student achievement of 6 percent to 11 per-
cent of a year of learning, some of which were offset by the hiring of inexperienced
teachers to lead newly created classes, particularly in the first few years of the pol-
icy.26 Other unintended negative consequences of the California policy included
an increase in class size in grades 4 and 5 and the use of multigrade classrooms.27
Florida enacted an even broader CSR policy that imposed specific caps on class
size in every core-subject classroom in every grade.28 This policy cost about $20
billion to implement during its first eight years, with continuing costs of $4 billion
to $5 billion each subsequent year. In a recent study, I found no evidence that
the Florida policy had any impact on test scores in grades 4 through 8, perhaps
because class size does not make much of a difference in these grades.29 I also
found no impacts on third-grade scores, which would be affected by class sizes
There is clearly a need for more rigorous studies of class size, but two important
conclusions emerge from the existing research. First, across-the-board reductions
in class size at the state level are likely to yield disappointing results, as was the case
in California and Florida. Second, CSR policies pursued at the school or district
level may produce larger effects in some circumstances. The Tennessee STAR
reduction of class size in the very early grades produced the largest class-size effects.
But even if reducing class sizes produces benefits this large, is it worth the cost?
The costs of class-size reduction are as certain as the benefits are uncertain.
Reducing class size means hiring more teachers and building more classrooms—
the public school system’s two primary costs. Reducing class size by one-third,
from 24 students to 16 students, requires hiring 50 percent more teachers.
Depending on how much extra space schools have, new facilities may need to be
built to accommodate the additional classes. And if small classrooms are built to
accommodate small classes, schools may be stuck with small classes in the future
even if they decide they are not cost-effective.
Consider this example: A school that pays teachers $50,000 per year (roughly the
national average) would save $833 per student in teacher salary costs alone by
increasing class size from 15 to 20.30 The true savings, including facilities costs and
teacher benefits, would be significantly larger. These resources could be used for
other purposes. If all of the savings were used to raise teacher salaries, for example,
the average teacher salary in this example would increase by $17,000 to $67,000.
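
The figures in this example can be reproduced with a short Python calculation. It is a sketch under the example's own assumptions (salary costs only, one teacher per class); the variable and function names are hypothetical.

```python
def salary_cost_per_student(salary: float, class_size: int) -> float:
    """Teacher salary cost attributable to each student in a class."""
    return salary / class_size


salary = 50_000  # annual teacher salary used in the example

# Per-student savings from moving from classes of 15 to classes of 20.
savings_per_student = (
    salary_cost_per_student(salary, 15) - salary_cost_per_student(salary, 20)
)

# If every dollar saved goes to the teacher of the larger class,
# the raise is the per-student savings times the 20 students taught.
raise_per_teacher = savings_per_student * 20
new_salary = salary + raise_per_teacher

print(round(savings_per_student))   # 833
print(round(raise_per_teacher))     # 16667, about $17,000
print(round(new_salary))            # 66667, about $67,000
```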
School finances are—and always will be—finite, so the right way to think about
every dollar spent is not “will it have any positive effect?” but “is this the best pos-
sible way to spend this dollar?”31 A hugely expensive policy has to produce very
impressive results in order for it to be preferable to all of the other potential uses
for those resources. Class-size reduction almost always fails this test because it is
too expensive to justify even benefits as large as those suggested by the Tennessee
STAR study.
Some advocates of class-size reduction argue that a policy that produces any
benefit is worth the cost, but that is only true if there are no alternative policies.
There are many educational policies that deserve to be carefully considered in
terms of their benefits and costs. The emerging consensus that teacher effective-
ness is the single most important in-school determinant of student achievement
suggests that teacher recruitment, retention, and compensation policies ought to
rank high on the list.
But an even better approach would be to let individual schools use small classes as
a response to very specific circumstances. An individual principal may decide, for
example, that a smaller class makes sense for an inexperienced teacher who needs
support in developing skills managing a classroom with several students with
behavior problems. At the same time, the principal may want to assign a larger
class to a highly effective veteran teacher, perhaps with some extra pay to compen-
sate the teacher for the extra work required. Of course, principals would need to
be given the flexibility to make such carefully considered arrangements.
10 Center for American Progress | The False Promise of Class-Size Reduction
student to learn at her own pace would be highly individualized yet could be pro-
vided to large numbers of students at a fraction of the cost of a small class.
The idea is not that virtual education could replace traditional education, but that
better outcomes might be achieved without an increase in costs through a com-
bination of software-driven and traditional instructional methods. Each method
(or a combination of the two) might be used in contexts where it is most advanta-
geous. For example, perhaps some math content could be taught to 30 students
in a computer lab with a teacher providing help as needed, which would free up
resources to teach reading skills in the traditional way in a smaller class.
References
American Federation of Teachers. 2007. “Survey and Analysis of Teacher Salary Trends 2005.” Washington.
American Federation of Teachers. 2009. “Survey and Analysis of Teacher Salary Trends 2007.” Washington.
Angrist, Joshua D., and Victor Lavy. 1999. “Using Maimonides’ Rule to Estimate the Effect of Class Size on Scholastic Achievement.” Quarterly Journal of Economics, 114(2): 533–575.
Chait, Robin, and Raegen Miller. 2010. “Treating Different Teachers Differently.” Washington: Center for American Progress.
Chingos, Matthew M. 2010. “The Impact of a Universal Class-Size Reduction Policy: Evidence from Florida’s Statewide Mandate.” Program on Education Policy and Governance Working Paper 10-03.
Education Commission of the States. 2005. “State Class-Size Reduction Measures.” Denver.
Dee, Thomas S., and Martin R. West. “The Non-Cognitive Returns to Class Size.” Educational Evaluation and Policy Analysis. Forthcoming.
Hanushek, Eric A. 1999. “The Evidence on Class Size.” In Susan E. Mayer and Paul E. Peterson, ed., Earning & Learning: How Schools Matter. Washington: Brookings Institution Press.
Hanushek, Eric A. 2003. “The Failure of Input-Based Schooling Policies.” Economic Journal, 113(485): F64–F98.
Hanushek, Eric A. 2010. “The Economic Value of Higher Teacher Quality.” National Bureau of Economic Research Working Paper No. 16606.
Hanushek, Eric A., and Steven G. Rivkin. 2010. “Generalizations about Using Value-Added Measures of Teacher Quality.” American Economic Review, 100(2): 267–71.
Hess, Frederick M., and Jon Fullerton. 2009. “Balanced Scorecards and Management Data.” Working paper. Center for Education Policy Research.
Hill, Carolyn J., and others. 2008. “Empirical Benchmarks for Interpreting Effect Sizes in Research.” Child Development Perspectives, 2(3): 172–177.
Hoxby, Caroline M. 2000. “The Effects of Class Size on Student Achievement: New Evidence from Population Variation.” Quarterly Journal of Economics, 115(4): 1239–1285.
Jepsen, Christopher, and Steven Rivkin. 2009. “Class Size Reduction and Student Achievement: The Potential Tradeoff between Teacher Quality and Class Size.” Journal of Human Resources, 44(1): 223–250.
Krueger, Alan B. 1999. “Experimental Estimates of Education Production Functions.” Quarterly Journal of Economics, 114(2): 497–532.
Krueger, Alan B. 2003. “Economic Considerations and Class Size.” Economic Journal, 113(485): F34–F63.
Millsap, Mary Ann, and others. 2004. “A Descriptive Evaluation of the Federal Class-Size Reduction Program: Final Report.” U.S. Department of Education.
Sims, David. 2008. “A Strategic Response to Class Size Reduction: Combination Classes and Student Achievement in California.” Journal of Policy Analysis and Management, 27(3): 457–478.
Sims, David. 2009. “Crowding Peter to Educate Paul: Lessons from a Class Size Reduction Externality.” Economics of Education Review, 28: 465–473.
Wößmann, Ludger, and Martin West. 2006. “Class-Size Effects in School Systems Around the World: Evidence from Between-Grade Variation in TIMSS.” European Economic Review, 50(3): 695–736.
Endnotes
1 Education Next, Program on Education Policy and Governance 2007 Survey. Results available at http://educationnext.org/files/EN-PEPG_Complete_Polling_Results.pdf.
2 Education Commission of the States, “State Class-Size Reduction Measures” (2005).
3 Florida Department of Education, “2009–10 Florida Education Finance Program,” Florida DOE Information Database Workshop, Summer 2009, available at http://www.fldoe.org/eias/databaseworkshop/ppt/fefp.ppt.
4 Mary Ann Millsap and others, “A Descriptive Evaluation of the Federal Class-Size Reduction Program: Final Report,” U.S. Department of Education (2004).
5 The pupil-teacher ratio is not equivalent to average class size, but the two are related and only pupil-teacher ratio has been recorded for the entire country over a reasonably long period of time.
6 The federal law is the Education for All Handicapped Children Act of 1975, which was revised and renamed the Individuals with Disabilities Education Act in 1990.
7 For additional evidence that changes in special education do not explain more than a modest share of the fall in the pupil-teacher ratio, see Eric A. Hanushek, “The Evidence on Class Size,” in Susan E. Mayer and Paul E. Peterson, ed., Earning & Learning: How Schools Matter (Washington: Brookings Institution Press, 1999).
8 For examples of high-quality international evidence, see Joshua D. Angrist and Victor Lavy, “Using Maimonides’ Rule to Estimate the Effect of Class Size on Scholastic Achievement,” Quarterly Journal of Economics, 114(2) (1999): 533–575; and Ludger Wößmann and Martin West, “Class-Size Effects in School Systems Around the World: Evidence from Between-Grade Variation in TIMSS,” European Economic Review, 50(3) (2006): 695–736.
9 For a summary of several of these studies, see Eric A. Hanushek and Steven G. Rivkin, “Generalizations about Using Value-Added Measures of Teacher Quality,” American Economic Review, 100(2) (2010): 267–71.
10 Eric A. Hanushek, “The Economic Value of Higher Teacher Quality,” National Bureau of Economic Research Working Paper No. 16606 (2010).
11 Robin Chait and Raegen Miller, “Treating Different Teachers Differently” (Washington: Center for American Progress, 2010).
12 Eric A. Hanushek, “The Failure of Input-Based Schooling Policies,” Economic Journal, 113(485) (2003): F64–F98.
13 Alan B. Krueger, “Economic Considerations and Class Size,” Economic Journal, 113(485) (2003): F34–F63.
14 The seminal analysis of Project STAR is Alan B. Krueger, “Experimental Estimates of Education Production Functions,” Quarterly Journal of Economics, 114(2) (1999): 497–532.
15 A third group was assigned to a regular-size class with a teacher’s aide. Having an aide in the classroom had no impact on student achievement.
16 The effect size after one year was about 0.2 standard deviations, which is converted to years of learning using the average annual gain in effect size from kindergarten to first grade (averaged for math and reading). Reported in Carolyn J. Hill and others, “Empirical Benchmarks for Interpreting Effect Sizes in Research,” Child Development Perspectives, 2(3) (2008): 172–177.
17 Author’s calculations from the Project STAR data. The effect size after three years for students who entered in kindergarten was 0.14 standard deviations. Dividing this effect size by the first four average annual gains in effect size reported by Hill and others (2008) indicates that it corresponds to about 16 percent of a year of learning (i.e., the effect size fell in terms of standard deviations of student test scores, but not in terms of years of learning). Test scores are the average of standardized (mean zero, standard deviation one) reading and math scores. Estimated effects, which are based on the students who entered the experiment in kindergarten, control for school fixed effects and student race, gender, and free lunch receipt.
18 For a detailed discussion of this issue, see Eric A. Hanushek, “The Evidence on Class Size,” in Susan E. Mayer and Paul E. Peterson, ed., Earning & Learning: How Schools Matter (Washington: Brookings Institution Press, 1999).
19 The four-year effect sizes (in standard deviations) for the students that entered the study in kindergarten are 0.23 for black students, 0.13 for white students, 0.17 for students eligible for the federal free or reduced-price lunch program, 0.14 for students not eligible for this program, 0.03 for girls, and 0.25 for boys (author’s calculations from the Project STAR data using third-grade reading and math test scores). The effects (in standard deviations) on third-grade scores for all students in the STAR experiment who took the third-grade test, including later entrants, were 0.15 for all students, 0.24 for blacks, 0.13 for whites, 0.21 for disadvantaged students, 0.10 for nondisadvantaged students, 0.11 for girls, and 0.18 for boys. These results also control for school-by-entry-wave fixed effects (the level at which randomization occurred).
20 Caroline M. Hoxby, “The Effects of Class Size on Student Achievement: New Evidence from Population Variation,” Quarterly Journal of Economics, 115(4) (2000): 1239–1285.
21 American Federation of Teachers, “Survey and Analysis of Teacher Salary Trends 2005” (2007).
22 This idea is discussed in an international context by Ludger Wößmann and Martin West, “Class-Size Effects in School Systems Around the World: Evidence from Between-Grade Variation in TIMSS,” European Economic Review, 50(3) (2006): 695–736.
23 Alan B. Krueger, “Experimental Estimates of Education Production Functions,” Quarterly Journal of Economics, 114(2) (1999): 497–532.
24 Eric A. Hanushek, “The Failure of Input-Based Schooling Policies,” Economic Journal, 113(485) (2003): F64–F98.
25 Thomas S. Dee and Martin R. West, “The Non-Cognitive Returns to Class Size,” Educational Evaluation and Policy Analysis (forthcoming).
26 Christopher Jepsen and Steven Rivkin, “Class Size Reduction and Student Achievement: The Potential Tradeoff between Teacher Quality and Class Size,” Journal of Human Resources, 44(1) (2009): 223–250. The effect sizes in school-level standard deviations in their preferred model (columns 4 and 9 of Table 3) ranged from 0.037 to 0.072 in grades 2 and 3. These effect sizes are converted to years of learning using the grade- and subject-specific average annual gains in effect size reported in Hill and others (2008). However, the converted figures probably overstate the amount of student learning because school-level standard deviations tend to be substantially smaller than student-level standard deviations.
27 David Sims, “A Strategic Response to Class Size Reduction: Combination Classes and Student Achievement in California,” Journal of Policy Analysis and Management, 27(3) (2008): 457–478, and David Sims, “Crowding Peter to Educate Paul: Lessons from a Class Size Reduction Externality,” Economics of Education Review, 28 (2009): 465–473.
28 The maximum class sizes in Florida are 18 in grades PK–3, 22 in grades 4–8, and 25 in grades 9–12.
29 Matthew M. Chingos, “The Impact of a Universal Class-Size Reduction Policy: Evidence from Florida’s Statewide Mandate,” Program on Education Policy and Governance Working Paper 10-03 (2010).
30 The average teacher salary in the United States is $51,049 (American Federation of Teachers, “Survey and Analysis of Teacher Salary Trends 2007” (2009)).
31 Some studies calculate cost-effectiveness by comparing the estimated cost of a policy to an estimate of the policy’s effects on students’ future earnings (over their lifetimes, discounted to the present). A policy that produces more in benefits than it costs is said to be cost-effective, but this type of cost-benefit analysis misses the important point that those costs may have produced even larger benefits had they been spent on a different policy.
32 Eric A. Hanushek and Steven G. Rivkin, “Generalizations about Using Value-Added Measures of Teacher Quality,” American Economic Review, 100(2) (2010): 267–71.
33 Specifically, they found that having a teacher who is one standard deviation above average (as compared to the average teacher) increases test scores by 0.11 and 0.15 standard deviations in reading and math, respectively. I convert these estimates to years of learning (after adjusting for the fact that the 75th percentile is about 0.7 standard deviations above the mean) by dividing by the average of the average annual gains in effect size from grades 4 to 8 (the grades often covered by value-added studies) reported in Hill and others (2008).
34 Eric A. Hanushek, “The Economic Value of Higher Teacher Quality,” National Bureau of Economic Research Working Paper No. 16606 (2010).
35 Robin Chait and Raegen Miller, “Treating Different Teachers Differently” (Washington: Center for American Progress, 2010).
36 But perhaps citizens would feel differently if they understood the costs of smaller classes and the tradeoffs with other policies that they favor, tradeoffs that are particularly salient in the current economic climate. This past November, 55 percent of voters in Florida voted to modestly scale back their CSR policy (but were unable to because the threshold to amend the state constitution is 60 percent).
37 State governments may also wish to impose maximum class sizes that are based on safety considerations, such as the maximum number of children that can safely fit into a classroom. Children, who are required by law to attend school, should not be placed with 40 other students in a classroom designed for 25 or 30, but such horror stories also should not be used as justification for reducing class size from 25 to 23 across an entire state at a cost of billions of dollars.
38 Frederick M. Hess and Jon Fullerton, “Balanced Scorecards and Management Data,” working paper, Center for Education Policy Research (2009).
About the author
Acknowledgements
The Center for American Progress thanks The Bill & Melinda Gates Foundation
for generously providing support for this paper. The author would like to thank
Theodora Chang, Raegen Miller, Megan Slack, and Martin West for helpful com-
ments and suggestions.
1333 H Street, NW, 10th Floor, Washington, DC 20005 • Tel: 202-682-1611 • Fax: 202-682-1867 • www.americanprogress.org