OBF Flashcards 201402
contents
Introducing the Evaluation Flash Cards
As part of our ongoing work to strengthen our
support for communities, the trustees and staff of
the Otto Bremer Foundation engaged in a series of
learning seminars on evaluation. In order to make
the core concepts easily accessible and retrievable,
we asked Michael Quinn Patton, who led these
seminars, to create a set of basic reference cards.
These became the Evaluation Flash Cards presented here, with the idea that a core concept can
be revisited in a flash. Illustrations of the
concepts are drawn from Otto Bremer Foundation
grants. We hope this resource is useful to other
organizations committed to understanding and
improving the results of the programs they support.
1 Evaluative Thinking
2 Evaluation Questions
3 Logic Models
4 Theory of Change
5 Evaluation vs. Research
6 Dosage
7 Disaggregation
8 Changing Denominators, Changing Rates
9 SMART Goals
15 Accountability Evaluation
16 Formative Evaluation
17 Summative Evaluation
18 Developmental Evaluation
19 The IT Question
20 Fidelity or Adaptation
Evaluative Thinking
distinguish evaluative thinking from evaluation.
Evaluation is an activity. Evaluative thinking is a way of doing business.
Evaluative thinking is systematic results-oriented
thinking about:
What results are expected,
How results can be achieved,
What evidence is needed to inform future actions and
judgments, and
How results can be improved in the future.
Evaluative thinking becomes most meaningful when it is
embedded in an organization's culture. This means that
people in the organization expect to engage with each
other in clarifying key concepts, differentiating means
and ends, thinking in terms of outcomes, examining the
quality of evidence available about effectiveness, and
supporting their opinions and judgments with evidence.
Evaluative thinking is what characterizes learning organizations. Keeping up with research and evaluation findings
becomes part of everyone's job. Inquiring into the empirical basis for assertions about what works and doesn't
work becomes standard operating procedure as people
in the organization engage with each other and interact
with partners and others outside the organization. Critical
thinking and reflection are valued and reinforced.
bottom line: Practice evaluative thinking. Like any important skill, evaluative thinking improves with
practice and reinforcement.
Evaluation Questions
begin with basic description.
Evaluation supports reality-testing, finding out what is actually going on in a program. This can then
be compared to what was intended and hoped for. But the first step is basically descriptive.
I keep six honest serving-men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who.
Rudyard Kipling (1865-1936), The Elephant's Child
Logic Models
models can be displayed as a series of logical and sequential
connections. each step in a logic model can be evaluated.
A logic model is a way of depicting the program intervention by specifying inputs, activities, outputs,
outcomes, and impacts in a sequential series.
Explanations of some of the terms used in logic models follow.
Inputs are resources like funding, qualified staff, participants ready to engage in the program, a place to hold the
program, and basic materials to conduct the program. These inputs, at an adequate level, are necessary precursors
to the program's activities.
Participating in program activities and processes logically precedes outputs, like completing the program or
getting a certificate of achievement.
Outputs lead to short-term participant outcomes, like a better job or improved health.
Short-term outcomes lead to longer-term impacts, like a more prosperous or healthy community.
inputs/resources → activities/processes → outputs/products → short-term outcomes → long-term impact
Logic models are one way of answering the It question in evaluation. The logic model depicts what is being evaluated.
The primary criteria for judging a logic model are whether the linkages are logical and reasonable.
1. Are the inputs (resources) sufficient to deliver the proposed activities?
bottom line: Is the proposed logic model sequence from inputs to impacts logical and reasonable?
Theory of Change
testing a theory of change can be an important
contribution of evaluation.
A theory of change explains how to produce desired outcomes. It is explanatory. A logic model just
has to be sequential (inputs before activities, activities before outcomes), logical, and reasonable.
In contrast, a theory of change must explain why the activities produce the outcomes.
example
A program to help homeless youth move from the
streets to permanent housing proposes to:
1. Build trusting relationships with the homeless youth;
2. Work to help them feel that they can take control of
their lives, instill hope, and help them plan their own
futures; and
3. Help them complete school, both for their economic
well-being and to help them achieve a sense of
accomplishment.
This approach is based on resilience research and theory.
Resilience research and theory posits that successful
youth: (1) have at least one adult they trust and can
interact with; (2) have a sense of hope for the future;
(3) have something they feel good about that they have
accomplished; and (4) have at least some sense of control
over their lives.
bottom line: Can a program identify a theory of change based on research and, if so, can it demonstrate
how it will translate the theory into an actual program?
Evaluation vs. Research
bottom line: Distinguish research from evaluation. Use research to inform both program and evaluation
designs.
Dosage
different degrees of intervention and engagement produce
different levels of outcomes.
Dosage effects refer to the fact that different people engage in and experience a program with different
degrees of intensity. A higher dose of engagement should be related to higher-level outcomes.
Example
Question
Data
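As a rough illustration of how dosage effects can be examined, the Python sketch below groups hypothetical participant records by level of engagement and compares average outcomes across dosage levels. The cut points, field names, and numbers are all invented for the example.

```python
# Hypothetical records: hours of program engagement and an outcome score (0-100).
participants = [
    {"hours": 5, "outcome": 40},
    {"hours": 8, "outcome": 48},
    {"hours": 12, "outcome": 55},
    {"hours": 30, "outcome": 72},
    {"hours": 38, "outcome": 75},
    {"hours": 45, "outcome": 80},
]

def dosage_group(hours):
    """Bucket engagement into low / medium / high dosage (illustrative cut points)."""
    if hours < 10:
        return "low"
    if hours < 30:
        return "medium"
    return "high"

# Collect outcome scores by dosage group, then compare group averages.
by_group = {}
for p in participants:
    by_group.setdefault(dosage_group(p["hours"]), []).append(p["outcome"])

for group in ("low", "medium", "high"):
    scores = by_group.get(group, [])
    if scores:
        print(f"{group}: n={len(scores)}, average outcome = {sum(scores) / len(scores):.1f}")
```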
bottom line: Watch for and understand dosage effects. All programs have them.
Disaggregation
what works for whom in what ways with what results?
Subgroups in programs have different experiences and different outcomes. Disaggregation refers
to distinguishing the experiences and outcomes of different subgroups.
Example A program aims to prevent teenage pregnancies. The program typically reports aggregate results for
all teens served (ages 13-19). The reported success rate is 60 percent, which means that 60 percent
of the teens do not get pregnant during the year they are engaged in the program.
Disaggregated Data
Success rate for teens aged 16-19: 80 percent
Lesson The overall 60 percent success rate for all teens disguises the fact that the program is highly effective
with older teens and relatively ineffective with younger teens. Indeed, some outcomes are different.
The program works to help older teens maintain safe and supported independence but attempts
to get younger teens integrated into a family, either their own or a foster family. In reality, the two
subgroups constitute different approaches with different results. The disaggregated data can help
decision makers target improvements to the subgroups for whom the program is less effective, and
learn from those that show higher levels of impact.
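As a minimal sketch of disaggregation, with invented records and age cut points loosely following the example above, the Python snippet below computes the overall success rate and then breaks it out by age subgroup.

```python
# Hypothetical participant records: age and whether the desired outcome was achieved.
records = [
    {"age": 13, "success": False},
    {"age": 14, "success": False},
    {"age": 15, "success": True},
    {"age": 16, "success": False},
    {"age": 17, "success": True},
    {"age": 18, "success": True},
    {"age": 19, "success": True},
]

def success_rate(rows):
    """Share of rows where the outcome was achieved."""
    return sum(r["success"] for r in rows) / len(rows)

younger = [r for r in records if 13 <= r["age"] <= 15]
older = [r for r in records if 16 <= r["age"] <= 19]

print(f"Aggregate success rate (ages 13-19): {success_rate(records):.0%}")
print(f"Younger teens (ages 13-15):          {success_rate(younger):.0%}")
print(f"Older teens (ages 16-19):            {success_rate(older):.0%}")
```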
bottom line: When looking at overall results for a program, ask about the disaggregated results for important
subgroups.
Changing Denominators, Changing Rates
To understand and interpret data on rates and performance indicators, like the participation
rate in a program, the drop-out rate, or the completion rate, pay special attention to the
denominator.
Example A local job-training program reports a 40 percent drop-out rate. The denominator for this program's
rate is based on the number who have completed the initial training and signed the program
contract. Thus, the drop-out rate is NOT based on the number who initially enroll in the program
but rather the number who enroll and complete the course and sign the contract. Half of the initial
enrollees do not reach that stage.
Illustrative Data
1. Number who enter the program from January to June: 200
11. Job retention percentage of all participants who enroll: 15 percent (30/200 = 15 percent)
Lesson Different rates have different denominators. Different denominators yield different rates. Programs
define and calculate drop-out and completion rates differently, which makes comparisons difficult.
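To make the arithmetic concrete, the Python sketch below computes a drop-out rate against two different denominators, using counts loosely modeled on the illustrative data above; the stage names and the drop-out count are assumptions made for the example.

```python
# Hypothetical counts at successive stages of a job-training program.
enrolled = 200         # entered the program from January to June
signed_contract = 100  # completed initial training and signed the program contract
dropped_out = 40       # left the program after signing the contract
retained_job = 30      # employed at follow-up

# The same drop-outs, divided by two different denominators, give two different rates.
print(f"Drop-out rate (denominator = contract signers): {dropped_out / signed_contract:.0%}")  # 40%
print(f"Drop-out rate (denominator = all enrollees):    {dropped_out / enrolled:.0%}")          # 20%

# Job retention as a share of everyone who enrolled, as in the illustrative data.
print(f"Job retention (denominator = all enrollees):    {retained_job / enrolled:.0%}")         # 15%
```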
bottom line: Be clear about the denominator being used when rates are reported.
SMART Goals
not all goals are created equal.
Traditionally, evaluation has been synonymous with measuring goal attainment. The most basic
evaluation question is: To what extent is the program attaining its goals? To evaluate goal attainment,
goals have to be clear enough to permit evaluation.
A clear goal has five dimensions, which form the acronym SMART:
Specific
Measurable
Achievable
Relevant
Time bound
examples
Weak goal: Participants will improve their quality of life.
This goal is vague and general (not specific). What is meant by quality of life? How would it be
measured? What's the timeframe?
SMART goal: Graduates will get a job paying a living wage with benefits and keep the job for
at least a year.
The outcome is relevant (the goal is aimed at the chronically unemployed; getting and
keeping a living-wage job is relevant to both participants and society)
The goal is time bound (keep the job at least one year)
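One reason a SMART goal is evaluable is that it can be checked directly against data. The Python sketch below tests hypothetical graduate records against the SMART goal stated above; the field names and the living-wage threshold are assumptions made for the example.

```python
# Hypothetical graduate records from an employment-training program.
graduates = [
    {"hourly_wage": 18.50, "has_benefits": True, "months_in_job": 14},
    {"hourly_wage": 12.00, "has_benefits": False, "months_in_job": 8},
    {"hourly_wage": 20.00, "has_benefits": True, "months_in_job": 11},
]

LIVING_WAGE = 17.00  # assumed local living-wage threshold, in dollars per hour

def meets_goal(g):
    """Job pays a living wage, includes benefits, and has been kept at least a year."""
    return g["hourly_wage"] >= LIVING_WAGE and g["has_benefits"] and g["months_in_job"] >= 12

attained = sum(meets_goal(g) for g in graduates)
print(f"Goal attainment: {attained} of {len(graduates)} graduates ({attained / len(graduates):.0%})")
```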
10
Illustrative indicators
Change in circumstances
Change in status
Change in behavior
Change in functioning
Change in attitude
Change in knowledge
bottom line: Outcomes are the desired results; indicators are how you know about outcomes. The key is to
make sure that the indicator is a reasonable, useful, and meaningful measure of the intended participant outcome.
11
Performance Targets
what's the bull's-eye?
A performance target specifies the level of outcome that is hoped for, expected, or intended.
What percentage of participants in employment training
will have full-time jobs six months after graduation?
40 percent? 65 percent? 80 percent? What percentage
of fathers failing to make child support payments will
be meeting their full child support obligations within six
months of intervention? 15 percent? 35 percent?
60 percent?
Setting performance targets should be based on data
about what is possible. The best basis for establishing
future performance targets is past performance. Last
year we had 65 percent success. Next year we aim for
70 percent. Lacking data on past performance, it may
be advisable to wait until baseline data have been gathered before specifying a performance target. Arbitrarily
setting performance targets without some empirical
baseline may create artificial expectations that turn out
unrealistically high or embarrassingly low. One way to
avoid arbitrariness is to seek norms for reasonable performance from comparable programs.
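A minimal sketch of checking actual performance against a baseline-informed target follows; all rates are invented for illustration.

```python
# Hypothetical performance data for an employment-training outcome.
last_year_rate = 0.65  # baseline: 65 percent employed six months after graduation
target_rate = 0.70     # next year's target, set modestly above the baseline
this_year_rate = 0.68  # observed result

gap_to_target = this_year_rate - target_rate
status = "met" if gap_to_target >= 0 else "not met"
print(f"Target of {target_rate:.0%} {status}: actual {this_year_rate:.0%} "
      f"({gap_to_target:+.1%} vs. target, {this_year_rate - last_year_rate:+.1%} vs. baseline)")
```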
example
Consider this outcome statement: Student achievement
test scores in reading will increase one grade level from the
beginning of first grade to the beginning of second grade.
Such a statement mixes together and potentially confuses the (1) specification of a desired outcome (better
reading) with (2) its measurement (achievement test
scores) and (3) the desired performance target (one
grade level improvement).
Qualitative Evaluation
qualitative data come from open-ended interviews, on-site
observations, fieldwork, site visits, and document analysis.
12
Qualitative evaluation uses case studies, systematically collected stories, and in-depth descriptions of processes and outcomes to generate insights into what program participants experience
and what difference those experiences make.
Suppose you want to evaluate learning to read. If you
want to know how well children can read, give them a
reading test (quantitative data). If you want to know
what reading means to them, you have to talk with them
(qualitative data). Qualitative questions aim at getting
an in-depth, individualized, and contextually sensitive
understanding of reading for each child interviewed. Of
course, the actual questions asked are adapted for the
child's age, language skills, school and family situation,
and purpose of the evaluation. But regardless of the
precise wording and sequence of questions, the purpose
is to hear children talk about reading in their own words;
find out about their reading behaviors, attitudes, and
experiences; and get them to tell stories that illuminate what reading means in their lives.
bottom line: Qualitative evaluation captures and communicates the perspectives, experiences, and stories of
people in programs to understand program processes and outcomes from their viewpoint.
13
Using multiple methods increases confidence in overlapping patterns and findings. Checking for
consistency across different data sources is called triangulation.
The term triangulation is taken from land surveying.
Knowing a single landmark only locates you somewhere
along a line in a direction from the landmark, whereas
with two landmarks you can take bearings in two directions and locate yourself at their intersection. The notion
of triangulating also works metaphorically to call to mind
the world's strongest geometric shape, the triangle.
The logic of triangulation is based on the premise that
no single method ever adequately solves the problem of
interpreting how much the weakness of any particular
method may give a false or inadequate result. Because
different kinds of data reveal different aspects of a program, multiple methods of data collection and analysis
provide more grist for the interpretation mill. Combinations of interviewing, observation, surveys, performance
indicators, program records, and document analysis
can strengthen evaluation. Studies that use only one
method are more vulnerable to errors.
Example
A site visit to a housing development turned up statistics
on residents characteristics, diversity, and income level
as well as the needs people expressed and stories about
living in the housing development. Staff learned that to
live in this development you need to work, be in school,
or volunteer formally. An evaluation
going forward might inquire how this policy works in
practice. Statistics would reveal patterns of work, school
attendance, volunteering, and resident turnover. Open-ended interviews would find out how residents and staff
experience these policies: the attitudes, knowledge,
behaviors, and feelings that affect the desired outcome
of building a vibrant residential community.
bottom line: The evaluation ideal is: No numbers without stories; no stories without numbers. Learn what
each kind of data reveals and teaches, and how to use them together: triangulating.
14
The most powerful, useful, and credible claims are those that are of major importance and have
strong empirical support. Claims can be important or unimportant, and the evidence for the claims
can be strong or weak. The ideal is strong evidence supporting claims of major importance.
Example of an effectiveness claim: Programs serving homeless youth are contributing significantly to reducing youth
homelessness in the Twin Cities.
Claims of major importance typically:
Affect a relatively large number of people
Provide a sustainable solution (something that lasts over time)
Save money and/or time, that is, accomplish something with less money and in less time than is usually the case (an efficiency claim)
Enhance quality
Deal with a problem of great societal concern
Claim to be new or innovative
Show that something can actually be done about a problem, that is, claim the problem is malleable
Involve a model or approach that could be used by others (meaning the model or approach is clearly specified and adaptable to other situations)
A two-by-two matrix crosses the importance of claims (major or minor) with the rigor of the evidence for claims (strong or weak). Goal: strong claims of major importance.
bottom line: Review claims, carefully examining the importance of the claim and the strength of the evidence.
Accountability Evaluation
different types of evaluation serve different purposes.
15
Examples
The utility of an accountability system depends on who is held accountable, by whom, for what. Accountability is most
meaningful when those held accountable actually have the capacity to achieve the things for which they are held
accountable, within the timeframes expected.
Formative Evaluation
different types of evaluation serve different purposes.
16
Formative evaluations support program improvement. The emphasis is on forming, shaping, and
improving, thus the term formative.
Formative evaluation questions
What works and what doesn't?
What are the program's strengths and weaknesses?
What's the feedback from participants in the program about what should be improved?
How do different subgroups respond, that is, what works for whom in what ways and under what conditions?
(If one size doesn't fit all, how can the needs of different people be met?)
How can outcomes and impacts be increased?
How can costs be reduced?
How can quality be enhanced?
The emphasis in these formative questions is on improvement.
Examples
A local program aims to help victims of domestic violence get jobs and improve their lives.
Across the variety of services offered, which ones are working well
and which need improvement?
In the empowerment gatherings, what works for whom in what
ways, with what outcomes? What can be learned from feedback
to improve the empowerment gatherings?
The utility of formative evaluation depends on a willingness to distinguish strengths from weaknesses and acknowledge
what needs improvement. Grantees often fear reporting weaknesses or problems to funders. Formative evaluation
requires mutual trust and a shared commitment to learning, improving, and getting better.
Summative Evaluation
different types of evaluation serve different purposes.
17
Summative evaluations judge the overall merit, worth, and significance of a project. The term
summative connotes a summit (important) or summing-up judgment.
The focus is on judging whether a model is effective. Summative evaluations are used to inform decisions about whether
to expand a model, replicate it elsewhere, and/or take it to scale (make it a statewide, region-wide, or national model).
Examples
The utility of summative evaluation is the focus on informing major decisions about a model's effectiveness and,
therefore, its relevance and dissemination to other communities.
bottom line: Summative evaluation requires rigorous evidence because the stakes are high. The evaluation
data must be high quality and credible to external stakeholders interested in the model.
Developmental Evaluation
different types of evaluation serve different purposes.
18
Example
A collaboration to support homeless youth involves
several organizations, each with its own projects and
evaluations. As individual agencies, they are engaged
in accountability reporting and formative evaluation
to increase effectiveness. But the overall collaborative
initiative is just beginning to be created as the organizations work together. This is a new development. As they
collaborate on both programming for homeless youth
The It Question
when we say it works or it doesn't work, what's the it?
The It is the program model being implemented and evaluated.
19
examples
A local job-training program has a structured curriculum
that aims to create a positive attitude about
undertaking employment training and taking personal
responsibility for success (not being a victim).
Habitat for Humanity has developed a model for how
to engage volunteers and low-income people together
in building a home affordable to and owned by a low-income family.
Fidelity or Adaptation
different approaches to disseminating models require
different evaluation approaches.
20
Two opposing approaches to implementing a model have very different evaluation implications.
The two approaches follow.
1. Fidelity-focused programming and evaluation means
a national model is being implemented in a local
community and is supposed to be implemented
exactly as prescribed in the national model. Fidelity-focused program models provide best practices
and standard operating procedures that amount to
a recipe for success. A McDonald's Big Mac burger is
supposed to be the same anywhere in the world.
Is the local model faithfully and rigorously implementing the standard model as specified?
21
High-quality lessons are supported by multiple sources of information. Knowledge confirmed from
multiple sources increases confidence that a lesson is valid and can be used to inform decisions and
future actions.
A common problem when an idea becomes highly
popular, in this case the search for lessons learned, is
that the idea loses its substance and meaning. Anybody
who wants to glorify his or her opinion can proclaim it
a lesson learned. High-quality lessons, in contrast,
represent principles extrapolated from multiple sources
and cross-validated that inform future action. In essence,
high-quality lessons constitute validated, credible,
trustworthy, and actionable knowledge.
Example
The importance of intervening in preschool years for
healthy child development and later school success
is supported by numerous evaluations, basic research
on child development, expert knowledge, practitioner
wisdom, and child development theory. In contrast,
lessons about how to work effectively with troubled
teenagers are weak in evidence, theory, research, and
number of evaluations.
22
The evaluation profession has adopted standards that are criteria for what constitutes a good
evaluation.
A high-quality evaluation is:
Useful
Practical
Ethical
Accurate
Accountable
example
A foundation commissions an evaluation of focus work
on youth homelessness. The first phase of the evaluation documents that:
the targeted number of new beds and services were
added to shelters; and
the grantees collaborated to design an evaluation of
the critical factors that lead to permanent housing and
stability for homeless youth.
The grantees and foundation staff use the Phase 1
evaluation findings to develop a proposal for Phase 2.
The foundation's trustees use the evaluation findings and the Phase 2 proposal to inform their grantmaking decision.
bottom line: Focus on evaluation use; don't let evaluation become just compliance reporting.
23
The graphic below depicts the inter-relationships among these four dimensions of evaluation sense making. The three
fundamental questions (What? So what? Now what?) are connected to the four evaluation processes of (1) analyzing
basic findings, (2) making interpretations, (3) rendering judgments, and (4) generating recommendations.
Graphic: the questions What? So what? and Now what? connected to the four processes: 1. basic findings, 2. interpretations, 3. judgments, 4. recommendations.
bottom line: When reviewing an evaluation report, watch for distinctions between basic findings,
interpretations, judgments, and recommendations, and the logical alignment and consistency among these
elements.
Utilization-Focused Evaluation
make attention to use the driving force behind
every decision in an evaluation.
24
Utilization-focused evaluation begins with the premise that evaluations should be judged by their
utility and actual use; therefore, evaluators should facilitate the evaluation process and design an
evaluation with careful consideration of how everything that is done, from beginning to end, will
affect use.
Use concerns how real people in the real world apply evaluation findings and experience the evaluation process.
Therefore, the focus in utilization-focused evaluation is on intended use by intended users.
Who is the evaluation for?
How is it intended to be used?
Program staff
Program director
Government policymakers
and cost-benefit, among many possibilities). Utilization-focused evaluation is a process for making decisions
about these issues in collaboration with an identified
group of primary users, focusing on their intended uses
of evaluation.
A psychology of use undergirds and informs utilization-focused evaluation. Intended users are more likely to
use evaluations if they understand and feel ownership of
the evaluation process and findings. They are more likely
to understand and feel ownership if they have been
actively involved. By actively involving primary intended
users, the evaluator is training users in use, preparing
the groundwork for use, and reinforcing the intended
utility of the evaluation every step along the way.
bottom line: When reviewing an evaluation proposal or report, is it clear who is intended to use the
evaluation and for what purposes?
25
Examples
Single program summative: A local job-training program.
Meta-analysis: Results of implementing a standardized quality improvement and rating system for childcare providers in multiple sites.
Principles-based synthesis: Youth homelessness work engaging programs operated by six organizations that share common principles and values but operate independently.
bottom line: Evidence-based programs must have evidence, but different kinds of evidence-based programs
make different claims. Beware simple opinions masquerading as evidence. Beliefs are beliefs. Beliefs about program
effectiveness must be evaluated to become an evidence-based program or model.
expanding opportunity, strengthening community
ottobremer.org