The Effective Use of Secondary Data

Russell M. Church

Brown University
In primary data analysis, the individuals who collect the data also analyze them; in meta-analysis, an investigator quantitatively combines the statistical results from multiple studies of a phenomenon to reach a conclusion; in secondary data analysis, individuals who were not involved in the collection of the data analyze the data.
Secondary data analysis may be based on the published data or it may be based on
the original data. Most studies of animal cognition involve primary data analysis;
it was difficult to identify any that were based on meta-analysis; secondary data
analysis based on published data has been used effectively, and examples are given
from the research of John Gibbon on scalar timing theory. Secondary data analysis
can also be based on the original data if the original data are available in an archive.
Such an archive in the field of animal cognition is feasible and desirable. © 2002 Elsevier Science (USA)
Key Words: secondary data analysis; data archives; animal cognition; primary
data analysis; meta-analysis; Gibbon; scalar timing theory.
The section of this article on secondary data analysis of original data was based on a talk at the meeting of the Comparative Cognition Society, Melbourne, FL, March 17, 2000, and on documents prepared for a workshop on Data Archiving for Animal Cognition Research sponsored by the National Institute of Mental Health and co-chaired by Russell M. Church and Howard S. Kurtzman that was held in Washington, DC, on July 19-20, 2001. The preparation of this article was supported by National Institute of Mental Health Grant MH44234 to Brown University. Kimberly Kirkpatrick was a major contributor to the development of an archive of data of timing research: http://www.Brown.edu/Research/Timelab.
Address reprint requests to Russell M. Church, Department of Psychology, Box 1853, 89 Waterman St., Brown University, Providence, RI 02912. Fax: (401) 863-1300. E-mail: [email protected].
data, and (6) interprets the results. The integration of the experimental design
and data collection stages with the data analysis and interpretation stages is
the hallmark of primary data analysis.
Articles based on primary data analysis may have an important influence
on further research. Any lasting impact of an article based upon primary data
analysis may be estimated by citations of it in subsequent empirical articles
and in reviews of the literature. The earlier article may be cited for its state-
ment of the problem, its methods, its published results, or its conclusions.
The subsequent articles rarely involve any further analyses of the original
data used for the published results.
In secondary data analysis, the individual or group that analyzes the data is not involved in the planning of the experiment or the collection of the data. Such analysis can be based upon the statistical information in published articles, upon the data available in the text, tables, graphs, and appendices of published articles, or upon the original data.
META-ANALYSIS
Meta-analysis refers to a quantitative combination of the statistical information from multiple studies of a given phenomenon. It provides a rigorous way to summarize the results of these studies. An excellent description of the approach is provided by Mullen (1989). The bases for meta-analysis were developed by R. A. Fisher and others in the 1920s and 1930s (Fisher, 1938); even the way to combine probabilities from independent studies was well known to researchers (Mosteller & Bush, 1954) long before the term meta-analysis was coined by Gene Glass (1976). The method was used to combine the results of 375 controlled evaluations of psychotherapy and counseling to reach the conclusions that therapy worked and that there were few important differences in the effectiveness of different types of therapy (Smith & Glass, 1977). A search of PsychInfo from January 1887 through November 2000 identified 3457 meta-analysis studies, all since 1977.
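The procedure for combining probabilities that long predates the term meta-analysis can be illustrated with a short sketch. The Python code below is not from any of the cited articles, and the five p values are invented for illustration; it implements Fisher's combined probability test, in which -2 times the sum of the log p values from k independent studies follows a chi-square distribution with 2k degrees of freedom under the joint null hypothesis:

```python
import math
from scipy.stats import chi2

def fisher_combined_p(p_values):
    """Fisher's method: combine p values from independent studies."""
    # -2 * sum(ln p_i) follows chi-square with 2k df under the joint null
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2.sf(statistic, df=2 * len(p_values))

# Hypothetical p values from five independent studies of one phenomenon
statistic, combined_p = fisher_combined_p([0.08, 0.12, 0.04, 0.20, 0.09])
print(f"chi-square = {statistic:.2f}, combined p = {combined_p:.4f}")
```

With these invented values, no single study is significant at the .05 level, yet the combined probability is, which is the sense in which combining independent results can be more informative than any one study.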
Although this procedure has been used extensively in other areas of psychology, a search of PsychInfo did not identify any meta-analysis studies in Animal Learning & Behavior, Behavioural Processes, Journal of Comparative Psychology, Journal of Experimental Psychology: Animal Behavior Processes, Learning and Motivation, or the Quarterly Journal of Experimental Psychology (B). Three meta-analysis studies were published in Animal Behaviour and one in Behaviour, but the only one related to animal cognition and behavior was about the role of magnetoreception in human navigation (Baker, 1987). There has also been a meta-analysis of the difference between prospective and retrospective time estimations of human participants (Block & Zakay, 1997). One meta-analysis article was published in Journal of the Experimental Analysis of Behavior, according to the indexing software (McSweeney, Farmer, Dougan, & Whipple, 1986). This was an extensive
basis of one set of experiments and the superposition effect on another set
of experiments.
Gibbon (1971) derived explicit solutions for a model of the mean inter-
response and intershock time functions for several free-operant avoidance
schedules. In one figure he showed that the model provided excellent fits
to the data from five individual animals from four different experiments
(Clark & Hull, 1966; Hake, 1968; Sidman, 1953; Verhave, 1959). Although
these experiments involved different species (rats, dogs, and monkeys) and
different procedures, Gibbon demonstrated that simple quantitative functions
based on scalar timing applied to results from all of them. This analysis was
extended to the effect of amount of reduction in shock density in a theoretical
article that included data from published experiments of others (Gibbon,
1972). An unwritten message from this article is that it is not necessary to
design a specific experiment to determine whether or not a quantitative prin-
ciple is applicable. Confidence in the generality of a principle may be in-
creased by the range of studies to which it applies and the fact that the author
did not design the experiment for the purpose of illustrating the principle.
According to the Science Citation Index, this important article (Gibbon, 1971) in the Journal of Mathematical Psychology has been cited only 21 times, and only four times in the past decade. In an influential article in the Psychological Review entitled "Scalar expectancy theory and Weber's law in animal timing" (Gibbon, 1977), scalar timing was applied to a wider range of procedures. This article has been cited 358 times, and the rate of citations does not appear to be decreasing: Over a third of the citations (126) were in the past 5 years (1996 through 2000).

FIG. 2. Three examples of relativistic timing. This figure is reprinted from Gibbon (1977). It was based on secondary analysis of published data from Dews (1970), LaBarbera and Church (1974), and unpublished data from Gibbon's laboratory.
The first figure in this theoretical article (Gibbon, 1977) demonstrated the
relativistic nature of timing behavior (Fig. 2). The three examples came from unpublished research from Gibbon's laboratory and two other experiments that used quite different methods: pigeons or rats, fixed or variable interval
schedules of reinforcement, various ranges of intervals, and food or shock
reinforcers (Dews, 1970; LaBarbera & Church, 1974). The function forms
obtained in the three experiments are different (increasing or decreasing,
linear or nonlinear), and different independent and dependent variables were
used. The one regularity was that all the functions within each of the experi-
ments were essentially the same. This required that the independent variable
represent time in relative rather than in absolute units.
In other figures in this article, Gibbon (1977) showed that the mean and
standard deviation increased linearly with the fixed interval, that the coeffi-
cient of variation (the standard deviation divided by the mean) was approxi-
mately constant (Schneider, 1969; Schneider & Neuringer, 1972) in a fixed
time schedule (Killeen, 1975), that the mean time to the second response in
a progressive interval schedule was linearly related to the interval schedule
on a log-log scale (Harzem, 1969), that the function relating normalized ac-
tivity to relative time was the same at three different intervals between food
presentations (Killeen, 1975), and that the function relating probability of a
short response to relative time was the same for various ranges of time
intervals (Stubbs, 1968).
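The scalar property that runs through these findings can be illustrated with a small simulation. The sketch below is not Gibbon's analysis; it generates hypothetical response times whose standard deviation grows in proportion to the mean (a coefficient of variation of 0.25 is assumed here), so that the coefficient of variation is the same at every interval and rescaling time by the mean superposes the distributions:

```python
import numpy as np

def scalar_response_times(mean_interval, cv=0.25, n=10_000, seed=0):
    """Simulate response times whose SD is proportional to the mean."""
    rng = np.random.default_rng(seed)
    return rng.normal(mean_interval, cv * mean_interval, size=n)

for interval in (30.0, 60.0, 120.0):  # hypothetical fixed intervals, in s
    t = scalar_response_times(interval)
    # The coefficient of variation is constant across intervals, and
    # t / t.mean() has essentially the same distribution at every interval.
    print(f"FI {interval:5.1f} s: mean = {t.mean():6.1f}, "
          f"sd = {t.std():5.1f}, cv = {t.std() / t.mean():.3f}")
```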
In the application of scalar expectancy theory to choice, Gibbon (1977) used data from Chung (1965), Chung and Herrnstein (1967), Duncan and Fantino (1970), Killeen (1970), Moffitt and Shimp (1971), Rachlin and Green (1972), Shimp (1968, 1969), and Staddon (1968) to compare the predictions of scalar expectancy theory with matching. The quantitative predictions of scalar expectancy theory were excellent in all five of the figures; the matching predictions were not. Gibbon's 1977 article provided broad support for the general principles of scalar expectancy theory and for the application of Weber's law in animal timing.
The locations of the data points in the published articles were carefully measured by enlarging the figures photographically to 8 × 10 inches and by using a Gerber Scientific GraphAnalogue ruler (Model GA-103) to measure the locations within 1% of the data values (Gibbon, 1977, footnote 2).1 Other approaches are to make an overhead transparency of a graph and project it on large graph paper on a wall, or to use a copy machine to enlarge a graph, enter the observed locations (in millimeters) in a spreadsheet, calculate the slope and intercept of the scale transforms, and transform the units from lengths to units of the dependent variable. A modern approach is to use a scanner to digitize a graph that can be saved in an appropriate format and then use software (such as DataThief, from www.shareware.com) to determine the locations of the points. These procedures provide ways to obtain reasonably accurate estimates of the values shown in graphs. The main limitation is that the dependent variables displayed in most graphs represent highly summarized information and not the original data. In some cases, such as Killeen (1975), John Gibbon requested and received copies of original data sheets and documentation.

1 Information about the Gerber GraphAnalogue is available at http://www.gerberscientific.com. It includes the following story from 1945:

While studying aeronautical engineering at Rensselaer Polytechnic Institute, H. Joseph Gerber was frequently required to solve intricate and time-consuming mathematical problems, many involving the plotting of points. One night, in an effort to reduce the tedious and repetitive nature of his homework assignments, he created an expandable ruler from the elastic waistband of his pajama bottoms, thereby inventing a new method of scaling distances between points. An instant sensation on campus, his invention was eventually modified to incorporate a unique triangular spring . . . giving birth to the Gerber Variable Scale. This handheld device, hailed as the greatest invention since the slide rule, is still in production today, more than 50 years after the fact, with thousands in use around the world. And, it is on permanent display at the Smithsonian Institution's Museum of American History.
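The measurement procedures described above all reduce to the same linear scale transform from measured lengths to data units. The following Python sketch shows that calculation; the calibration points and measured locations are invented for illustration, and a logarithmic axis would require applying the same transform to the logarithms of the axis values:

```python
def axis_transform(mm_a, value_a, mm_b, value_b):
    """Map a measured position (mm) to data units, given two
    calibration points read from an axis of the enlarged graph."""
    slope = (value_b - value_a) / (mm_b - mm_a)
    intercept = value_a - slope * mm_a
    return lambda mm: slope * mm + intercept

# Hypothetical calibration: x axis, 10 mm -> 0 s and 110 mm -> 60 s;
# y axis, 15 mm -> 0 responses/min and 95 mm -> 80 responses/min
to_seconds = axis_transform(10.0, 0.0, 110.0, 60.0)
to_rate = axis_transform(15.0, 0.0, 95.0, 80.0)

# Measured locations (mm) of three plotted points
for x_mm, y_mm in [(35.0, 22.0), (60.0, 48.0), (85.0, 71.0)]:
    print(f"t = {to_seconds(x_mm):5.1f} s, "
          f"rate = {to_rate(y_mm):5.1f} responses/min")
```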
In many of his studies of shock avoidance, autoshaping, interval schedules of positive reinforcement, and choice, Gibbon supported the generality of conclusions by using published quantitative data from other experiments. The figure that included data from the largest number of other experiments showed that the number of reinforcers to acquisition of an autoshaped response by pigeons was linearly related to the ratio of the intertrial duration
to trial duration on a log-log scale (Gibbon & Balsam, 1981; see Fig. 3).
In autoshaping, food may be presented at some interval (I) after the previous
food, and the stimulus may begin at some interval (T) prior to the food. For
example, I may be 100 s, and T may be 10 s. This makes the I/T ratio 10,
a ratio that produces a moderate acquisition score.
In his description of the historical and causal origins of scalar timing theory, Gibbon (1991) replotted data from a well-known study of fixed interval responding that demonstrated that the response rate on individual intervals abruptly changed from a low rate to a high rate and that this break point varied from one interval to another (Schneider, 1969). The top panel of Figure 4 shows the conventional representation of mean response rate as a function of time; the bottom panel shows that both the mean and the standard deviation of the break point increased linearly with the interval duration, although at different slopes. The linear functions accounted for about 99% of the variance. This is one of many cases in which Gibbon's analysis went beyond the original published report. Schneider had demonstrated the linearity of the mean of the break point as a function of the fixed interval, and he provided enough data for Gibbon to extend this to demonstrate the linearity of the standard deviation.
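The kind of linear fit reported in the bottom panel of Figure 4 can be sketched in a few lines. The break-point summaries below are invented for illustration (they are not Schneider's or Gibbon's values); the code fits a straight line to the mean and to the standard deviation of the break point as functions of the fixed interval and reports the proportion of variance accounted for:

```python
import numpy as np

# Hypothetical fixed intervals (s) with break-point means and SDs
fi = np.array([16.0, 32.0, 64.0, 128.0, 256.0])
bp_mean = np.array([10.9, 21.6, 43.2, 85.1, 171.8])
bp_sd = np.array([2.1, 4.0, 8.2, 16.3, 32.9])

for label, y in (("mean", bp_mean), ("standard deviation", bp_sd)):
    slope, intercept = np.polyfit(fi, y, 1)  # least-squares straight line
    predicted = slope * fi + intercept
    r2 = 1 - np.sum((y - predicted) ** 2) / np.sum((y - y.mean()) ** 2)
    print(f"break-point {label}: slope = {slope:.3f}, "
          f"intercept = {intercept:.2f}, proportion of variance = {r2:.4f}")
```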
FIG. 4. Response rate as a function of time since food delivery on a fixed interval schedule of reinforcement (top); mean and standard deviation of the break point as a function of the fixed interval (bottom). The figure is reprinted from Gibbon (1991). It was based on secondary analysis of published data from Schneider (1969).

John Gibbon made extensive use of secondary data analysis in the analysis
volved secondary data analysis of published data; the others involved primary data analysis, or a specification of a model or procedure. The most
important feature of secondary data analysis is that the actual quantitative results of published research are taken seriously. These results can be used to establish the generality of a quantitative function, and they can also be used to identify problems of theoretical interest. The close examination of the quantitative results of published studies has often facilitated the development and testing of new quantitative theory: An excellent example of this is Killeen's article in this issue.
SECONDARY ANALYSIS OF ORIGINAL DATA
For a primary analysis, the investigator must select particular summary
measures to report in the text, figures, tables, and short appendices. The deci-
sion about how to summarize the data is an important one, because it is
irreversible. It is seldom possible to regenerate the original data from the
summaries and, therefore, the published summary measures of performance
can only rarely be used to examine alternative measures of performance. For
example, a study of classical conditioning that reports absolute or relative
rates (or probabilities) of responding in the presence and absence of a stimu-
lus cannot be used to evaluate a real-time theory of conditioning because
the time of responding has been eliminated from the record.
The major limitation of secondary data analysis of the published data is
that, due to the constraints of space in paper journals, the publication includes
only a summarized version of the original data. There is considerable infor-
mation in the original data that cannot be recovered from the summary mea-
sures reported in the published article. If the original data were available,
this limitation of secondary data analysis would disappear. Some journals,
such as the Journal of the Experimental Analysis of Behavior, have a long
record of publishing large tables of data, but most rarely have space for more
than a few highly summarized tables.
Figures and highly summarized tables seldom provide the necessary information for the analysis of alternative dependent variables or the examination of different problems. For example, Crystal, Church, and Broadbent (1997) reported results of several experiments to examine systematic nonlinearities in rats' memory for time. The published analyses, based upon a particular dependent variable (the start time), could not be used by an analyst who wanted to examine a different dependent variable, such as the time of the median response. From the original data, however, it is possible to examine this dependent variable and a different problem, such as the scalar nature of the behavior. Figure 5 shows that both the mean and the standard deviation of the time of the median response of rats at 66 different interval durations between 10 and 140 s are approximately linear functions of the interval duration; therefore, the ratio of the standard deviation to the mean (the coefficient of variation) would be approximately constant. This is an illustration of the use of data from an experiment designed for the study of one topic (systematic nonlinearities in timing) for obtaining evidence on a different topic (Weber's law for timing).

FIG. 5. Mean and standard deviation of the time of the median response as a function of previous interval duration. This figure is based on secondary data analysis of the original data from Group 2 of Experiment 2 of Crystal, Church, and Broadbent (1997), available on the Web site http://www.brown.edu/Research/Timelab.
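A minimal sketch of this kind of secondary analysis is given below. The trial records are invented, and only two interval durations are shown rather than 66; the point is that, given archived original data, an alternative dependent variable such as the time of the median response can be computed directly:

```python
import statistics

# Hypothetical raw records: (interval duration in s, response times in s)
trials = [
    (10.0, [4.1, 6.8, 9.2]),
    (10.0, [3.9, 7.0, 9.5]),
    (40.0, [15.0, 26.2, 37.8]),
    (40.0, [16.4, 28.1, 39.0]),
]

# Time of the median response on each trial, grouped by interval duration
medians_by_interval = {}
for duration, times in trials:
    medians_by_interval.setdefault(duration, []).append(statistics.median(times))

for duration, medians in sorted(medians_by_interval.items()):
    m = statistics.mean(medians)
    sd = statistics.stdev(medians)
    print(f"interval {duration:5.1f} s: mean = {m:6.2f} s, "
          f"sd = {sd:.2f} s, cv = {sd / m:.3f}")
```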
Every year research psychologists collect an enormous amount of data.
They summarize it, they attempt to explain it, and they publish articles based
on the research. The original data may be in notebooks, in stacks of data
forms and, increasingly, in computer files. Various guidelines for data retention and data sharing have been offered by journal editors, by the ethical standards of the American Psychological Association, and by others.
Many psychologists maintain these records for five years or more, but it
currently is difficult to analyze them further: The details of the data formats
may be difficult to reconstruct, and they may be stored on media that are
no longer standard. Some investigators are willing and able to make their
original data available upon request to others, but, for all practical purposes,
after publication most original data are inaccessible. At present, any influ-
ence of a research project must be due to the features that can be published,
such as the procedure, the summarized data, or the conclusions; it cannot be
based on the original data.
Many experimenters currently record extensive data from experiments,
and these records may be saved on removable disks or other storage media.
For example, the experimenter may automatically record the times of occurrence of several stimuli and responses during many sessions. Such records are more complete than summary measures, which do not preserve the specific times of occurrence of each measured event, and they provide more information than is usually available in the published articles. For example, they would provide information about each animal on each session. Secondary data analysis could be done on the original data, if the original data were available.
It is now feasible to make original data available on the internet. At present
some investigators have made data available on personal websites, and this
is reasonably easy to do. The original data from many experiments can be
represented as the times of occurrence of different events. These can be rep-
resented in text format in a table with many rows and two columns. The
original data from other experiments can be represented as numbers or words
in text format in tables with many rows and more than two columns. For
some purposes, such as archiving of stimuli, other formats are required. Sci-
entific societies or other organizations could provide an important service by
facilitating the development of standard format conventions and by providing
storage space for data that could be mirrored at other sites.
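As a sketch of the two-column text representation described above, a session file could be written and read back as follows. The file name, event codes, and tab-delimited layout are hypothetical choices for illustration, not an established standard:

```python
import csv

# Hypothetical event codes for a timing experiment
EVENTS = {1: "stimulus_on", 2: "response", 3: "food"}

# Each row: time of occurrence (s since session start) and event code
rows = [(0.000, 1), (3.412, 2), (5.127, 2), (10.000, 3)]

with open("session001.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["time_s", "event"])
    for time_s, code in rows:
        writer.writerow([f"{time_s:.3f}", code])

# A secondary analyst reads the archived session back
with open("session001.txt", newline="") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        print(float(row["time_s"]), EVENTS[int(row["event"])])
```

A plain text table of this kind is compact, readable in any spreadsheet or statistical package, and independent of any particular storage medium, which addresses the obsolescence problem described above.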
The archive should be easy to use: It should be well organized and compact; it should use standard formats; and the software should be written in a way that guarantees fast, reliable transmission. Most critically, each data set
in the archive should be accompanied by good documentation that specifies
completely and in a standardized form the experimental procedures and the
codes used. Sole reliance on personal archives is not a sufficient long-term
solution to providing permanent and wide access to original data. A useful
archive of data should contain data from research on a topic that is important
and focused (e.g., animal cognition), but that is broader than the research of
a single laboratory.
A data archive should contain a large quantity of high-quality data with
both hierarchical organization of subtopics and features to facilitate searches.
An archive of data would be particularly useful if it contained various forms
of assistance, such as references to the published literature, links to other
relevant sites, data analysis tools, theory tools, and technical support. Data
sets linked to full-text versions of published articles that have gone through
the peer review process are particularly valuable, especially if the full-text
versions are widely available.
A major concern is that investigators will not put their data in archives.
They may consider the effort of putting the data into a standard format to
outweigh the value of the general accessibility of the data. Many articles are
rarely cited, and may be rarely read; many data sets may never be used for
secondary data analysis. Some investigators may also be concerned that er-
rors will be exposed or that others will publish articles based upon their data
or will propose alternative explanations of their data. Examples in which