AUTHORS’ NOTE: This article is a revised version of a paper presented at the 1982
meeting of the Evaluation Research Society. Preparation of this article was aided by a
grant from the National Science Foundation (Grant SES-8121745), of which P. H. Rossi
is the Principal Investigator.
evaluation research we have called elsewhere (Chen and Rossi, 1980) the
"theory-driven" approach to evaluation, a perspective, we believe, that
has promise to yield better information on social programs, as well as
rich yields to the basic social science disciplines.
Of course the kind of theory we have in mind is not the global
conceptual schemes of the grand theorists, but much more prosaic
theories that are concerned with how human organizations work and
how social problems are generated. It advances evaluation practice very
little to adopt one or another of current global theories in attacking, say,
the problem of juvenile delinquency, but it does help a great deal to
understand the authority structure in schools and the mechanisms of
peer group influence and parental discipline in designing and evaluating
a program that is supposed to reduce disciplinary problems in schools.
Nor are we advocating an approach that rests exclusively on proven
theoretical schema that have received wide acclaim in published social
science literatures. What we are strongly advocating is the necessity for
theorizing, for constructing plausible and defensible models of how
programs can be expected to work before evaluating them. Indeed the
theory-driven perspective is closer to what econometricians call "model
specification" than are more complicated and more abstract and general
theories.
Nor do we argue for uncritically using the theories that may underlie
policymakers’ and program designers’ views of how programs should
work. Often enough policymakers and program designers are not social
scientists and their theories (if any) are likely to be simply the current
folklore of the upper-middle-brow media. The primary criterion for
identifying theory in the sense used in this article is consistency with
social science knowledge and theory. Indeed theoretical structures
constructed out of social science concerns may directly contradict what
may be the working assumptions of policymakers and program
designers.
It is an acknowledged embarrassment to our viewpoint that social
science theory is not well enough developed that appropriate theoretical
frameworks and schema are ordinarily easily available "off the shelf."
But the absence of fully developed theory should not prevent one from
using the best of what is already at hand. Most important of all, it is
necessary to think theoretically, that is, to rise above the specific and the
particular to develop general understandings of social phenomena.
A GENERALIZED MODEL
FOR PROGRAM EVALUATION
[Figure 1: A generalized model for program evaluation]
OUTCOME VARIABLES:
SPECIFYING THE GOALS OF PROGRAMS
For example, although the major goals of the 1968 Federal Firearms
Regulation Act were specific enough (i.e., to restrict access to firearms
on the part of felons, the insane, and certain other categories of persons),
the mode of implementation, namely, requiring the registration of gun
dealers and forbidding them to sell to the proscribed social categories,
was bound to fail since it was based on the assumptions that gun-using
criminals obtained their weapons through gun dealers, that gun dealers
could discern which of their potential customers fit into the proscribed
categories, and that gun registration records could be easily accessed to
trace gun ownership. None of these assumptions was tenable and some
went quite contrary to existing established knowledge (Wright et al.,
1983).
Theory-Derived Goals,
Not Specified by Policy
light of existing social theory and knowledge. We turn to that task in the
next two sections of this article.
SPECIFYING THE
TREATMENT MODEL
But black box randomized experiments are not the only realization of
the experimental paradigm and, indeed, may often be an inefficient
form of that paradigm. This arises because advocates of the black box
experimental paradigm often neglect the fact that after randomization
exogenous variables are still correlated with outcome variables. Knowing
how such exogenous factors affect outcomes makes it possible to
construct more precise estimates of experimental effects by controlling
for such exogenous variables. For example, an experiment on the
recidivism of released prisoners can estimate treatment effects with
smaller standard errors by taking into account the fact that age,
education, and previous work experiences of the released prisoners
ordinarily affect tendencies to recidivate. For a given N, a randomized
experiment that takes into account existing theory and knowledge can
have considerably more power than a black box randomized experiment.3
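A minimal simulation sketch of this point, assuming Python with numpy and statsmodels (the variable names and coefficients are invented for illustration, not taken from any actual study):

```python
# Sketch: covariate adjustment shrinks the standard error of the
# treatment-effect estimate in a randomized experiment, at the same N.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
treat = rng.integers(0, 2, n)              # randomized assignment
age = rng.normal(30, 8, n)                 # prognostic covariates that
education = rng.normal(11, 2, n)           # remain correlated with outcomes
# Hypothetical outcome: a -0.5 treatment effect plus covariate effects.
recidivism = -0.5 * treat - 0.08 * age - 0.20 * education + rng.normal(0, 1, n)

# Black box analysis: outcome regressed on treatment alone.
black_box = sm.OLS(recidivism, sm.add_constant(treat)).fit()
# Theory-informed analysis: also control for covariates known to affect
# recidivism, removing their share of the residual variance.
X = sm.add_constant(np.column_stack([treat, age, education]))
adjusted = sm.OLS(recidivism, X).fit()

print("unadjusted SE:", black_box.bse[1])  # larger
print("adjusted SE:  ", adjusted.bse[1])   # smaller: a more powerful design
```

Both regressions estimate the same unbiased effect; the adjusted one simply estimates it more precisely, which is the power gain at issue.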
The black box paradigm also dominates classical discussions of
nonexperimental approaches (Campbell and Stanley, 1966). Such
discussions center on the inherent dangers of black box
quasi-experimental approaches. This may be appropriate if one is
estimating the effectiveness of a program for which there is no
underlying sensible rationale, but it is not sensible to ignore existing
knowledge when its use can increase the power of the research design.
Indeed, at best, it may be possible to obtain unbiased estimates of
effects from quasi-experimental approaches if one can model with some
degree of accuracy the relationships among all the elements of the
treatment model. For example, an evaluation of an unemployment
insurance program in California (Rauma and Berk, 1982) was able to
control for the exogenous factors that determined the size of a released
prisoner’s benefit eligibility, because such benefits were completely
determined by the number of days worked while in prison.4 By holding
constant the number of days worked while in prison, it was possible to
hold constant the exogenous factors that determined receiving the
treatment and hence to construct unbiased estimates of the effects of the
treatment on subsequent recidivism.
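The logic of that evaluation can be sketched as follows; this is an illustration under invented assumptions (a hypothetical eligibility cutoff and made-up coefficients), not a reconstruction of the Rauma and Berk analysis:

```python
# Sketch: when treatment receipt is completely determined by an observed
# variable (days worked in prison), holding that variable constant yields
# unbiased treatment-effect estimates from nonexperimental data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2000
days_worked = rng.uniform(0, 200, n)
eligible = (days_worked >= 90).astype(float)   # assumed eligibility rule
# Days worked affects recidivism directly AND determines eligibility,
# so a naive eligible-vs-ineligible contrast is confounded.
recidivism = 1.0 - 0.003 * days_worked - 0.30 * eligible + rng.normal(0, 0.5, n)

naive = sm.OLS(recidivism, sm.add_constant(eligible)).fit()
X = sm.add_constant(np.column_stack([eligible, days_worked]))
controlled = sm.OLS(recidivism, X).fit()

print("naive estimate:     ", naive.params[1])       # roughly -0.6, biased
print("controlled estimate:", controlled.params[1])  # near the true -0.30
```

The correction works only because the assignment rule is completely known and the direct effect of days worked is modeled correctly, which is precisely the role theory plays here.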
The general issue of controlling for self-selection bias has been
discussed thoroughly in recent literature (Barnow et al., 1980) and more
recently by Berk and Ray (1982). How successfully such approaches can
be applied in particular cases is determined by how well known are the
exogenous processes and how well they can be measured. Furthermore,
it is somewhat obvious, but bears emphasis, that knowledge and theory
about the phenomena being treated are needed to identify the relevant
exogenous processes and to uncover them.
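A small sketch makes the dependence on measurement concrete (all variables and coefficients are hypothetical): when the factor driving self-selection is measured, regression adjustment recovers the effect; when it is unmeasured, the naive estimate is badly biased:

```python
# Sketch: self-selection bias is removable only insofar as the selection
# process is known and measured.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 3000
motivation = rng.normal(0, 1, n)                    # drives self-selection
# More motivated people both enroll more often and earn more anyway.
enroll = (motivation + rng.normal(0, 1, n) > 0).astype(float)
earnings = 2.0 * enroll + 3.0 * motivation + rng.normal(0, 1, n)

naive = sm.OLS(earnings, sm.add_constant(enroll)).fit()
X = sm.add_constant(np.column_stack([enroll, motivation]))
adjusted = sm.OLS(earnings, X).fit()

print("true effect:                   2.0")
print("selection factor unmeasured:", naive.params[1])     # far above 2.0
print("selection factor measured:  ", adjusted.params[1])  # close to 2.0
```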
One of the main benefits of departing from the black box treatment-
as-unit approach to evaluation is an enhanced ability to generalize from
the research in question to other circumstances. The end result of a
black box evaluation is to know whether or not a given treatment-as-
unit is effective and to what extent it is so. A transfer into a different
administrative environment and subsequent modifications to fit the
requirements of that environment may drastically alter the treatment’s
effectiveness, if the elements changed are among the more important
within the treatment-as-unit. Indeed, since the translation of a proposed
program into an enacted program always requires modification to fit the
administrative environment into which it is placed, as well as to the
political acceptability constraints of the policymakers, it is important to
be able to point out what are the essential and nonessential components
of a proposed program.
The main points made with respect to the modeling of the treatment
processes and components of delivered treatments apply as well to the
modeling of intervening processes. Indeed any model of the treatment
process necessarily includes modeling intervening processes. From
some viewpoints it hardly makes any sense to distinguish intervening
processes except that, for programs that may be expected to have very
long-term effects, whether or not intervening processes occur may be the
first sign of whether or not a program is working. For example, if a
manpower training program is to be installed to increase the earning
power of participants over the long run, it may be useful as a first step to
specify what has to change in the short run in order that the long-range
effects of the desired sort may be eventually captured. Thus, if a training
program does not increase the job-relevant skills of participants, it
seems unlikely that long-run wages will increase. In short, the
specification of intervening processes provides an early reading on
whether a program is working.
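A brief sketch of this use of intervening variables, with hypothetical training data (the variable names and effect sizes are ours, not drawn from any actual program):

```python
# Sketch: testing the intervening link (training -> skills) as an early
# sign that a long-run wage effect is plausible.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 800
trained = rng.integers(0, 2, n).astype(float)
# Program theory: training raises skills; skills later raise wages.
skills = 1.5 * trained + rng.normal(0, 1, n)   # short-run mediator
wages = 0.8 * skills + rng.normal(0, 2, n)     # long-run outcome

# The mediator can be checked long before wage data mature.
skill_check = sm.OLS(skills, sm.add_constant(trained)).fit()
print("effect of training on skills:", skill_check.params[1])
# An estimate near zero here would make a long-run wage effect operating
# through skills implausible: the program theory fails at the first link.
```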
Implementation Modeling
Implementation systems traditionally have not been given the
attention they deserve in evaluation research. As
pointed out earlier, experimental evaluations of prospective programs
involve setting up arrangements for delivering programs (or treatments);
hence even programs set up for testing purposes by researchers involve
implementation systems. Even more important is the fact that a
program once enacted must be carried out through an implementation
system that includes administrative rules and regulations, bureaucratic
structures, and personnel who have been given the responsibility to
administer the program in question.
An understanding of program implementation is important in
program evaluation, since successful implementation is a necessary
condition for assessing the success of program theory. Only when treatment
variables are implemented successfully, at least to some extent, can we
test whether or not the treatment variables have had any impact upon
outcome variables.
In the evaluation literature there has been no dearth of interest in
implementation, but too much of the attention has been given to
worrying about whether programs have been delivered as intended, and
not enough attention has been given to understanding the process of
implementation. Thus Levine (1972) stated that the main problem of the
War on Poverty was the failure of programs to be implemented in the
field. Gramlich and Koshel (1975) found that the performance contracting
experiments failed to the extent that they were not implemented (or
implemented incorrectly) in the field.
Part of the problem of integrating a concern for implementation
process into evaluation stems from the fact that evaluators tend to be
specialists in the disciplines relevant to treatment processes. Thus an
evaluator concerned with the outcome of educational programs usually
knows a great deal about educational processes, but may know very
little about theories of organization; hence the organizational contexts
of the program may be neglected or unspecified.
In Figure 1 we have designated an implementation system as the
organizational arrangement that is either specially designed to deliver
treatments (or programs) or given the responsibility to do so. We do not
mean to imply that this box represents a simple system. Indeed at least
six subsystems have been identified in the existing literature (e.g., Van
Meter and Van Horn, 1975; Williams and Elmore, 1976; Scheirer, 1981),
and these are detailed below.
At one extreme, standardized treatments such as transfer payments can be
delivered to persons through the mail. At the other extreme, treatments that involve
tailoring interventions to the characteristics of targets usually involve
allowing considerable discretion to the frontline implementer, a circum-
stance that may considerably distort program intentions. Indeed, for
most human services programs the dilemma is the optimum level
of discretion to allow. If too little discretion is allowed, inappro-
priate treatments may be administered to clients. If too much discretion
is allowed, it may become very difficult to determine precisely what was
delivered, as is the case with many educational programs designed to
alter the teaching practices in the classroom.
Another characteristic of treatments that bears attention is the matter
of dosage. Thus a transfer payment that amounts to $100 per week is
simply worth a lot more than 100 times a transfer payment of $1 per
week. Or, just because we know that three hours of counseling per week
may help a client does not mean that one hour per week will simply do
one-third less. The amount of an intervention, especially as actually
delivered, ought to be an important concern of evaluators.
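A short sketch of why delivered dosage deserves explicit modeling, using hypothetical counseling data in which returns to each additional hour diminish (the concave dose-response is assumed purely for illustration):

```python
# Sketch: dose-response need not be linear in the amount delivered.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 600
hours = rng.uniform(0, 5, n)                   # counseling hours per week
# Assumed truth: improvement is concave in dose (diminishing returns).
improvement = 2.0 * np.sqrt(hours) + rng.normal(0, 1, n)

# A linear-in-dose model forces one hour to do one-third of three hours' work.
linear = sm.OLS(improvement, sm.add_constant(hours)).fit()
# Adding a curvature term lets the data reject that assumption.
X = sm.add_constant(np.column_stack([hours, hours**2]))
flexible = sm.OLS(improvement, X).fit()

print("linear fit R2:  ", linear.rsquared)
print("flexible fit R2:", flexible.rsquared)
print("curvature term: ", flexible.params[2])  # negative: diminishing returns
```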
Resources. Obviously a program requires sufficient resources to
enable it to accomplish the delivery of treatment. Funds are used to hire
personnel, to acquire physical facilities, and so on. An underfunded program simply
will not be able to deliver the treatments as prescribed.
CONCLUSIONS
NOTES
to rainfall, the very expensive experiments were not powerful enough to detect treatment
effects. A strong argument was then advanced against any additional black box
experiments on the effects of cloud seeding.
4. Since prisoners did not know that their working would affect their eligibility for and
amount of benefits (the legislation had not yet been enacted at the time of imprisonment), they
could not have worked in prison because they anticipated postrelease benefits. Of course,
for subsequent cohorts of prisoners, the possibility of prison work being affected by
anticipated benefits will have to be taken into account.
REFERENCES
LEVINE, R. A. (1972) Public Planning: Failure and Redirections. New York: Basic
Books.
MIELKE, K. W. and J. W. SWINEHART (1976) Evaluation of the Feeling Good
Television Series. New York: Children’s Television Workshop.
MORRIS, L. L., C. T. FITZ-GIBBON, and M. E. HENERSON (1978) Program
Evaluation Kit. Beverly Hills, CA: Sage.
MOYNIHAN, D. P. (1969) Maximum Feasible Misunderstanding. New York: Free
Press.
RAUMA, D. and R. BERK (1982) "Crime and poverty in California." Social Science
Research 11, 4: 318-351.
RIECKEN, H. W. and R. F. BORUCH (1974) Social Experimentation: A Method for
Planning and Evaluating Social Intervention. New York: Academic.
ROSSI, P. H. and K. LYALL (1976) Reforming Social Welfare. New York: Russell Sage.
ROSSI, P. H., R. A. BERK, and K. J. LENIHAN (1980) Money, Work and Crime. New
York: Academic.
SCHEIRER, M. A. (1981) Program Implementation: The Organizational Context.
Beverly Hills, CA: Sage.
SCRIVEN, M. (1972) "Pros and cons about goal-free evaluation." Evaluation Comment
3: 1-4.
SUCHMAN, E. A. (1969) "Evaluating educational programs." Urban Review 3, 4: 15-17.
VAN METER, D. S. and C. E. VAN HORN (1975) "The policy implementation process:
a conceptual framework." Administration and Society 6, 4: 445-488.
WATTS, H. and A. REES (1976) The New Jersey Income Maintenance Experiment,
Vols. 2 and 3. New York: Academic.
WHOLEY, J. S., J. N. NAY, and R. E. SCHMIDT (1975) "Evaluation: where is it really
needed?" Evaluation Magazine 2, 2: 89-93.
WILLIAMS, W. (1976) "Implementation analysis and assessment," in W. Williams and
R. F. Elmore (eds.) Social Program Implementation. New York: Academic.
——— and R. F. ELMORE [eds.] (1976) Social Program Implementation. New York:
Academic.
WRIGHT, J. D., P. H. ROSSI, and K. DALY (1983) Under the Gun. Hawthorne, NY:
Aldine.
Huey-tsyh Chen is Associate Research Scientist at the Center for Metropolitan Planning
and Research and an Assistant Professor in the Department of Social Relations at the
Johns Hopkins University. He is interested in developing theoretical models in program
evaluation and implementation for policy decisions. Currently he is editing an evaluation
book with Peter H. Rossi.
Peter H. Rossi is Professor of Sociology and Director of Research at the Social and
Demographic Research Institute of the University of Massachusetts at Amherst. He is
coauthor (with Howard Freeman) of Evaluation: A Systematic Approach (Sage, 1982)
and (with S. Nock) of Measuring Social Judgments: The Factorial Survey Approach
(Sage, 1982). He is past president of the American Sociological Association and recipient
of the 1981 Alva and Gunnar Myrdal Prize of the Evaluation Research Society, awarded
for contributions to evaluation research methods.