
This article was downloaded by: [Kungliga Tekniska Hogskola]

On: 04 February 2015, At: 09:19


Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Ethics & Behavior


Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/hebh20

Ethical Issues and Guidelines for Conducting Data Analysis in Psychological Research

Rachel Wasserman
Department of Psychology, Loyola University Chicago

Accepted author version posted online: 13 Sep 2012. Published online: 25 Jan 2013.

To cite this article: Rachel Wasserman (2013) Ethical Issues and Guidelines for Conducting Data
Analysis in Psychological Research, Ethics & Behavior, 23:1, 3-15, DOI: 10.1080/10508422.2012.728472

To link to this article: http://dx.doi.org/10.1080/10508422.2012.728472

ETHICS & BEHAVIOR, 23(1), 3–15
Copyright © 2013 Taylor & Francis Group, LLC
ISSN: 1050-8422 print / 1532-7019 online
DOI: 10.1080/10508422.2012.728472

Ethical Issues and Guidelines for Conducting Data Analysis in Psychological Research

Rachel Wasserman
Department of Psychology
Loyola University Chicago

Psychologists are directed by ethical guidelines in most areas of their practice. However, there are
very few guidelines for conducting data analysis in research. The aim of this article is to address the
need for more extensive ethical guidelines for researchers who are post–data collection and beginning
their data analyses. Improper data analysis is an ethical issue because it can result in publishing false
or misleading conclusions. This article includes a review of ethical implications of improper data
analysis and potential causes of unethical practices. In addition, current guidelines in psychology and
other areas (e.g., American Psychological Association and American Statistical Association Ethics
Codes) were used to inspire a list of recommendations for ethical conduct in data analysis that is
appropriate for researchers in psychology.
Keywords: ethics, data analysis, research, guidelines

The American Psychological Association (APA; 2012) charges psychologists to uphold “high
standards of ethics, conduct, education, and achievement.” The field of psychology began pri-
marily as an academic and scientific profession. Thus, analyzing data and reporting scientific
findings are an integral part of many psychologists’ work. Although the current APA Ethics Code
includes some guidelines for issues involved in research and publication, most of the specific
ethical standards focus on issues relevant to clinical work (APA, 2010). Many psychologists have
argued that more specific codes are needed that focus on research, specifically the appropriate
use of data analytic strategies (see Sterba, 2006, for a review). However, early efforts to dis-
cuss ethical issues related to research design, data analysis, and data reporting received some
backlash from the scientific community (Pomerantz, 1994, and Parkinson, 1994 in response to
Rosenthal, 1994). Rosenthal (1994) argued that research should be considered within an ethical
context and that “bad science makes for bad ethics” (p. 128). Some applauded Rosenthal’s rea-
soning but questioned whether it would hinder productivity, as researchers might be hesitant to
conduct imperfect research if it is considered unethical (Pomerantz, 1994). Others took a firmer
stance against Rosenthal’s comments, stating that scientific quality and ethical quality of research
methods should be evaluated separately (Parkinson, 1994). Based on the focus of the APA Ethics
Code, it is likely that many psychologists continue to share this belief that research methods,
including data analysis, are purely methodological rather than ethical issues.

Correspondence should be addressed to Rachel Wasserman, Department of Psychology, Coffey Hall, Loyola
University Chicago, Chicago, IL 60660. E-mail: [email protected]
It is generally accepted that flagrant scientific misconduct is an ethical issue. The APA
Committee on Standards in Research offered the following definition of scientific misconduct,
as agreed upon by the National Science Foundation and the U.S. Public Health Service: the “fal-
sification, fabrication, or plagiarism (FFP) in proposing, conducting, or reporting research, or
other practices that seriously deviate from those commonly accepted by the scientific commu-
nity” (Grisso et al., 1991, p. 763). Such flagrant misconduct is directly addressed in the APA
Ethics Code under Standard 8.10, Reporting Research Results, which states that psychologists
should not fabricate data and that errors in published data must be corrected, and in Standard
8.11, which prohibits plagiarism (APA, 2010).


However, the “other practices that seriously deviate from those commonly accepted” part of
this definition is vague and too vast to provide definitions and guidelines for specific concerns
(Resnick, 2011). Leaders in the psychology community are calling for greater attention toward
“other deviations” that may be less severe but more common (e.g., Kraut, 2011), and increasing
attention has been devoted to understanding these less flagrant acts of misconduct. In a recent sur-
vey project, John, Loewenstein, and Prelec (2012) concluded that psychology researchers admit
to performing “questionable research practices” at alarmingly high rates. Other researchers are
also exploring the frequency of various types of scientific misconduct, besides FFP (De Vries,
Anderson, & Martinson, 2006; Koocher, Keith-Spiegel, Tabachnick, Sieber, & Butler, 2010;
Simmons, Nelson, & Simonsohn, 2011), and the responses of peers to these scientific wrong-
doings (Koocher & Keith-Spiegel, 2010). Considering these recent endeavors, it appears that
researchers and psychologists are increasingly concerned about the more subtle ethical issues
involved in research practices.
Although researchers may engage in “other deviations” in several areas of research (e.g., data
collection, publication, etc.), this article focuses specifically on ethical issues in data analysis.
Data analysis poses at least three special difficulties. First, researchers use different methods
to conduct data analysis, as they often disagree regarding the best method. When researchers
engage in “questionable research practices,” they usually believe that their methods are justifiable
(John et al., 2012). This belief raises the following question: How can researchers know when
a questionable method they use to perform data analyses reflects a difference of opinion or a
possible violation of ethics? Second, there are many reports of statistical misuse in the literature,
but most are published in statistical or quantitative journals (Graham, 2001). Thus, these reports
and suggestions for ethical data analytic procedures do not always reach the intended audience.
Third, when a researcher engages in a method of data analysis that is not accepted by others in
the field (whether knowingly or not), it often goes unnoticed by readers. Sterba (2006) argued
that there are two types of misconduct in data analysis: covert and overt. As she defined it, “Overt
misconduct includes improper analytic practices a researcher reports in the results section of
an article. In contrast, covert misconduct is not detectable from reading the results section of
an article” (Sterba, 2006, pp. 306–307). Regardless of whether a researcher intends to hide his
or her methods, our current peer review process is not able to identify covert misconduct in
data analysis (e.g., inappropriately trimming data to create a better chance of significant findings
without reporting it; Sterba, 2006). In sum, because there are disagreements concerning which
types of data analytic procedures are appropriate, suggestions for proper data analysis are often
inaccessible, and covert misuses of data are difficult to detect, it is clear that more guidance is
needed regarding ethical practices in data analysis.
Several attempts to provide ethical guidelines for data analysis have been made. The
American Statistical Association (ASA) has developed the “Ethical Guidelines for Statistical
Practice” (Committee on Professional Ethics, 1999). More recently, quantitative psychologists
have published an in-depth reference for ethical issues in their field: the Handbook of Ethics
in Quantitative Methodology (Panter & Sterba, 2011). However, the “Ethical Guidelines for
Statistical Practice” was created specifically for statisticians. Thus, the language and general
guidelines may be unclear or irrelevant for most psychologists. The Handbook of Ethics in
Quantitative Methodology is intended for psychologists and other social science researchers and
is a detailed account of ethical issues regarding specific statistical procedures. Still, the hand-
book does not include clear, overarching ethical guidelines for data analyses. Although these
volumes contain many important ideas for the ethical conduct of data analysis, they do not pro-
vide a clear set of ethical guidelines that are generally applicable to researchers in psychology.
Graham (2001) presented a useful set of statistical guidelines for psychologists. However, this
paper was presented at the 2001 APA Division 17 meeting in Houston but was never published
in a more accessible psychology journal. In addition, there has been much discussion of ethical
issues in data analysis and several changes to the APA Ethics Code since this presentation. The
current article is designed to provide psychology researchers with a framework for considering
ethical issues in data analysis (e.g., Gardenier & Resnik, 2002). In addition, this author aims to
incorporate the most recent literature to build upon the guidelines offered by Graham (2001).

AN ETHICAL FRAMEWORK: LOOKING TO THE APA ETHICS CODE

The current APA “Ethical Principles of Psychologists and Code of Conduct” is divided into two
parts: general principles and more specific standards. The general principles “are not themselves
enforceable rules, (although) they should be considered by psychologists in arriving at an ethical
course of action” (APA, 2010). The first three principles are briefly discussed to provide an ethical
framework for issues in data analysis. The final two principles are addressed later in this article
in regard to more specific concerns.
Principle A: Beneficence and Nonmaleficence states that “psychologists strive to benefit those
with whom they work and take care to do no harm” (APA, 2010). Publishing inaccurate or
misleading information due to misuse of data analysis is certainly not beneficial to the profes-
sion, society, or individuals and may cause harm if incorrect information is used to develop
interventions or make policy decisions. As Panter and Sterba (2011) eloquently stated, “Design
and analysis decisions have ethical implications when they indirectly affect human welfare . . .
through their influence on how federal grants are allocated, what future research is pursued, and
what policies or treatments are adopted” (p. 5). Thus, by considering potential consequences of
their work scientists may engage in more ethical research.
Principle B: Fidelity and Responsibility states, “Psychologists establish relationships of trust
with those with whom they work. They are aware of their professional and scientific responsi-
bilities to society and to the specific communities in which they work” (APA, 2010). There is an
implicit level of trust between researchers and consumers of their work, whereby consumers
assume that the researcher is honest in his or her methods and that the findings are sound.
Kraut (2011) described the recent case of Diederik Stapel’s data fabrication as “harmful to the
scientific enterprise” (p. 1). Granted, this is an extreme case of flagrant research misconduct
(falsification/fabrication of data). Still, any measure of dishonesty in research may tarnish the
trust between researcher and reader. Therefore, researchers may behave more ethically if they
consider their responsibility to maintain a professional relationship with society.
Principle C: Integrity states, “Psychologists seek to promote accuracy, honesty, and truthful-
ness in the science, teaching, and practice of psychology” (APA, 2010). Simmons and colleagues
(2011) demonstrated the ease with which a researcher may report an inaccurate finding, due to
ambiguous data analytic procedures. These authors proposed that it is unlikely that researchers
intend to deceive their consumers, but they may allow their desire to produce publishable results
to influence their decisions about how to analyze their data. Thus, psychologists who, above all
else, aim to “discover and disseminate truth” (Simmons et al., 2011, p. 1365), are more likely to
adhere to this ethical principle.
As previously mentioned, the ethical principles are meant not as enforceable regulations but
rather aspirational goals. As such, they inherently do not provide specific guidance for resolving
ethical issues in data analysis. The following includes a review of specific ethical issues in data
analysis and recommendations for ethically responsible research.

ISSUES AND GUIDELINES FOR THE ETHICAL CONDUCT OF DATA ANALYSES

Medin (2011) argued that a discussion of ethical issues is not enough to foster ethical actions,
but instead “psychological scientists might want to . . . see if misconduct can be reduced by
teaching ethical practices” (para. 11). The following guidelines aim to provide a discussion of
specific ethical challenges and recommendations for ethical practice in data analysis. They are
compiled from other researchers’ suggestions (e.g., Graham, 2001; Panter & Sterba, 2011) and
current ethical codes of the APA and ASA. Although these guidelines certainly do not address
every issue, they are intended to provide a basis for further thought and discussion of ethical
issues faced by researchers who are conducting data analyses. Keith-Spiegel, Sieber, and Koocher
(2010) listed several areas of “irresponsible or unethical acts” in research that do not include FFP
(p. 7). The following discussion and proposed guidelines are organized by these categories (see
Table 1 for a list of guidelines).

Incompetence

Keith-Spiegel and colleagues (2010) suggested that incompetence can lead to inappropriate
statistical analyses when “scientists do not recognize their own inadequacies” (p. 8). Consider
a researcher who is interested in looking at depressive symptoms as a predictor of whether a per-
son attempts suicide. Suppose this researcher uses SPSS to analyze his or her data and enters a
score on a depression scale as the independent variable and whether the patient attempted suicide
as the dependent variable. Although the regression is set up correctly, the researcher still makes
the mistake of running a regular regression rather than a logistic regression. It is possible to run a
regular regression with a dichotomous dependent variable. However, Tabachnick and Fidell (2007)
suggested that, for a dichotomous outcome, a logistic regression is a more appropriate analysis
than multiple regression because it will not produce nonsensical predictions (e.g., an individual
with a certain depression score is predicted to commit suicide –12% of the time; p. 437).

TABLE 1
Recommendations for Conducting Ethical Data Analyses

1. Researchers should be competent in the methods they use to analyze data; otherwise they should obtain training,
experience, consultation, or supervision when necessary.
2. Researchers should choose a minimally sufficient analysis, based on the research question and assumptions of the
statistical technique. If the computer program does not allow for the type of analysis required, researchers should find
a program that does rather than change their research question.
3. Those performing data analyses should be familiar with the database (i.e., the constructs being studied, how the data
were collected, and how the data were prepared for analysis, including proper exploratory analyses). Also, researchers
should present their data to colleagues and have others proofread early drafts of their work to uncover possible
analytical mistakes.
4. If data analysis is performed by another party, researchers should take care to avoid requests that would likely lead to
a loss of objectivity.
5. Researchers should report any changes made to the raw data and provide a rationale for such changes.
6. Researchers should use data analyses in an attempt to find true observations within a sample, not to bolster
preconceived notions or special interests.
7. Researchers should specify and honestly report whether they use confirmatory or exploratory analyses. For
confirmatory analyses, researchers should attempt to formulate hypotheses before any analyses are run on the data set.
In the case of exploratory analyses without a clear hypothesis (e.g., step-wise regression models), replication is
encouraged to validate the findings.
8. Researchers should delegate work only to persons whom they expect to complete the analysis competently. When
data analysis is performed by less qualified persons, researchers should provide adequate supervision (e.g., open
discussion of data analytic methods) to ensure that such persons perform these analyses competently.
9. Researchers should strive to create and maintain an ethical research environment, by fostering a community that is
aware of and values ethics in data analysis.
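The problem in the regression example above can be made concrete with a minimal pure-Python sketch. The depression scores and outcomes below are made up for illustration, and the logistic coefficients are not fitted by maximum likelihood; the point is only that a linear fit can predict "probabilities" outside [0, 1] while a logistic link cannot.

```python
import math

# Hypothetical data: depression scores (x) and whether a suicide
# attempt occurred (y = 0 or 1). Values are invented for demonstration.
x = [2, 5, 8, 12, 15, 20, 25, 30, 35, 40]
y = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

# Closed-form ordinary least-squares fit of y on x.
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx

def linear_pred(score):
    # Linear model: nothing constrains the output to [0, 1].
    return intercept + slope * score

def logistic_pred(score):
    # Logistic link: the sigmoid maps any linear predictor into (0, 1).
    # (Coefficients reused from the linear fit purely for illustration.)
    return 1 / (1 + math.exp(-(intercept + slope * score)))

# For a low (but possible) score, the linear model predicts a
# negative "probability" -- the nonsensical result the text describes.
print(linear_pred(0))   # negative for these data
print(logistic_pred(0)) # always strictly between 0 and 1
```

A fitted logistic regression (e.g., via maximum likelihood in a statistics package) is the appropriate analysis here; the sketch only shows why the linear model's unbounded predictions are the warning sign.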
Establishing and maintaining competence is addressed in the APA Ethics Code. Standard 2.01c
states that “psychologists planning to provide services, teach, or conduct research involving pop-
ulations, areas, techniques, or technologies new to them undertake relevant education, training,
supervised experience, consultation, or study” and Standard 2.03 states that “psychologists under-
take ongoing efforts to develop and maintain their competence” (APA, 2010). Data analysis is
certainly a “technology” or “technique.” However, it is easy to assume that this standard applies
only to clinical work.
Others have called attention to the issue of competence specifically in data analysis (e.g.,
Gardenier & Resnik, 2002). An APA task force, convened to develop recommendations for
the conduct of appropriate statistical analyses, offered the following recommendation: “Do not
report statistics found on a printout without understanding how they are computed or what they
mean” (Wilkinson & Task Force on Statistical Inference, 1999, p. 596). Authors of chapters in
the Handbook of Ethics in Quantitative Methodology similarly suggested that anyone conduct-
ing data analysis should know what pitfalls of statistical techniques to avoid and understand
assumptions of statistical techniques well enough to know when they can or cannot be vio-
lated (Gardenier, 2011; Hubert & Wainer, 2011). Graham (2001) wrote that psychologists should
“recognize the limits of your knowledge . . . and use only those procedures with which you are
competent” (p. 15). Techniques in data analysis are continually changing and improving. Thus,
it is important for anyone using data analytic techniques to be aware of the current methods.
The ASA suggests that statisticians should “remain current in dynamically evolving statistical
methodology; yesterday’s preferred methods may be barely acceptable today and totally obsolete
tomorrow” (Committee on Professional Ethics, 1999, p. A-3). Thus, to complete data analysis
ethically, researchers should be competent in the methods they use to analyze data; otherwise
they should obtain training, experience, consultation, or supervision when necessary.
Although new and innovative statistical techniques are certainly important to consider, they
are not always the most appropriate methods for every research question. When a researcher
learns about a new technique, he or she may be tempted to design a study that would allow
him or her to use this technique, simply because it is an interesting method. However, statistical
techniques are not designed to inspire research questions and thus should not serve as the basis
for a study. Researchers may also use a more advanced technique, in the hopes that it is more
likely to be published. Still, it is more important to choose a technique based on whether it can
answer the research question and whether it is appropriate for the data set. For instance, structural
equation modeling is a sophisticated technique that is used to analyze complex models. However,
a large number of participants is often needed to meet the assumptions of this technique. Thus,
for smaller sample sizes, it may be inappropriate. When a researcher chooses to use a statistical
technique based on its sophistication, he or she runs a greater risk of conducting inappropriate
analyses.
In addition, a researcher should not limit his or her choice of statistical technique by only using
techniques that he or she is familiar with. If researchers limit the types of questions they can ask
based on the types of analyses they know how to run, the breadth of possible research questions
will also be limited. For instance, consider the example of a novice researcher who tells his
mentor that he does not want to conduct a longitudinal study because he does not know how to run
repeated measure analyses. If he has a research question that is best answered with longitudinal
data, then he should gain the appropriate training or supervision necessary to properly answer his
research question.
There are no standards or principles in the APA Ethics Code that directly address this issue.
However, in reference to clinical assessment, the APA Ethics Code states that psychologists
should use techniques “for the purposes that are appropriate in light of the research on or evi-
dence of the usefulness and proper application of the technique” (APA, 2010, Standard 9.02).
It seems that this guideline could also apply to statistical techniques. More directly, the ASA
states that researchers should “address the suitability of the analytic methods and their inherent
assumptions relative to the circumstance of the specific study” and “promote effective and effi-
cient use of statistics by the research team” (Committee on Professional Ethics, 1999, C8 and E2).
In addition, the APA Board of Scientific Affairs argued, “Although complex designs and state-
of-the-art methods are sometimes necessary to address research questions effectively, simpler
classical approaches often can provide elegant and sufficient answers to important questions”
(Wilkinson et al., 1999, p. 598). Peterson (2009) emphasized the importance of using a mini-
mally sufficient analysis and argued that many researchers and journal reviewers do not follow
this advice. Wilkinson and the APA Board of Scientific Affairs (1999) also stated that the limits of
computer programs should not restrict a researchers’ methodology. These guidelines suggest that
researchers should choose a minimally sufficient analysis, based on the research question and
assumptions of the statistical technique. If the computer program does not allow for the type of
analysis required, researchers should find a program that does rather than change their research
question.

Carelessness

Carelessness can occur when a researcher is hurried, distracted, stressed, or simply inattentive
to details (Keith-Spiegel et al., 2010). Gardenier (2003) provided an example of carelessness in
data analysis. He said that “looser honest error” can occur when a knowledgeable statistician
is not diligent enough to fully understand the data that he or she is working with (Gardenier,
2003). This situation can easily occur if a statistician is contracted by a researcher to analyze
his/her data. This situation could also occur if a researcher works with an archival data set. For
example, a researcher might decide to run a multivariate analysis of variance (MANOVA) with
four dependent variables, but unbeknownst to the researcher, one of the variables is actually a
composite of the other three. One assumption of the MANOVA is that no dependent variables are
linear combinations of other variables. Thus, even though the researcher is knowledgeable about
the statistical technique, he or she has still violated this assumption. Such a mistake could cause
the overall MANOVA equation to be significant when in fact it would not be if the dependent
variables did not overlap. This type of mistake could lead to a “significant” finding that is not
truly significant (i.e., a false positive).
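The violated assumption can be checked directly: when one dependent variable is an exact linear combination of the others, the covariance matrix that MANOVA must invert is singular. A small pure-Python sketch, using invented scores and exact rational arithmetic so the zero determinant is exact:

```python
from fractions import Fraction

# Hypothetical scores on three subscales for five participants; the
# fourth "variable" is secretly the sum of the other three, as in the
# archival-data example above.
d1 = [3, 7, 2, 9, 4]
d2 = [5, 1, 6, 2, 8]
d3 = [4, 4, 3, 7, 1]
d4 = [a + b + c for a, b, c in zip(d1, d2, d3)]
data = [d1, d2, d3, d4]

def covariance_matrix(rows):
    # Sample covariance matrix (divisor n - 1), exact via Fractions.
    n = len(rows[0])
    means = [Fraction(sum(r), n) for r in rows]
    return [[sum((Fraction(rows[i][k]) - means[i]) * (Fraction(rows[j][k]) - means[j])
                 for k in range(n)) / (n - 1)
             for j in range(len(rows))] for i in range(len(rows))]

def det(m):
    # Cofactor expansion along the first row (fine for a 4x4 matrix).
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for j, val in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * val * det(minor)
    return total

cov = covariance_matrix(data)
# Because d4 is an exact linear combination of d1-d3, the covariance
# matrix is singular, and MANOVA's matrix inversion is not valid.
print(det(cov))  # 0
```

In floating-point statistical software the determinant may come out as a tiny nonzero number rather than exactly zero, which is partly why such redundancies can slip past both the researcher and the program.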
Although there are no relevant APA ethical standards, the ASA’s ethical guidelines include
a standard that addresses this issue of familiarity with the data set. Guideline A-7 states that
statisticians should know “the theory, the data, and the methods used in each statistical study”
(Committee on Professional Ethics, 1999). This guideline also suggests that statisticians should
ideally be involved from the very beginning of the study, so that they may participate in the
planning stages. If the researchers conducting the data analyses are not familiar with the study
and with the database, then they run the risk of making mistakes due to misunderstanding the data.
Thus, these ethical guidelines suggest that those performing data analyses should be familiar with
the database (i.e., the constructs being studied, how the data were collected, and how the data
were prepared for analysis, including proper exploratory analyses). Also, researchers should
present their data to colleagues and have others proofread early drafts of their work to uncover
possible analytical mistakes.

Dishonesty Indirectly Related to Work as a Researcher

Keith-Spiegel and colleagues (2010) suggested that a researcher’s character deficits, poor
management skills, and overextension of one’s resources could lead to “actions that may not
directly involve the actual conduct of research itself (but) can still corrupt science” (p. 9).
At times, researchers may ask other individuals to run the data analysis for them (e.g., if the
researcher is not competent in a data analytic technique). In this situation, it is important to
reflect on how the researcher asks the statistician to conduct such work. If the researcher asks the
statistician to “find something significant that I can present at my next meeting,” this may lead to
a loss of objectivity on the part of the statistician. Instead of conducting data analyses in the most
correct way, the statistician might focus on achieving an end result: a significant finding.
There are a couple of sources that discuss ethical issues involved when work is delegated to
a statistician. Principle E of the APA Ethics Code states, “Psychologists respect . . . rights of
individuals to privacy, confidentiality, and self-determination” (APA, 2010). Thus, an individual
who is asked to complete the data analysis should have the right to use his or her expertise to
form his or her own opinion of the findings. Keith-Spiegel and colleagues (2010) alluded to this
issue in one example of unethical behavior: “Bullying others into surrendering some rights for
the sake of one’s research” (p. 9). The ASA directly addresses this issue by stating that clients
employing statistical consultants should know that statisticians cannot guarantee a finding and
that they should not pressure a statistician to violate his or her own ethical code (Committee on
Professional Ethics, 1999). Therefore, if data analysis is performed by another party, researchers
should take care to avoid requests that would likely lead to a loss of objectivity.
Further, raw data are intended to include only direct observations from the environment that
reflect information about the general population. However, there are times when researchers
decide to modify the raw data for sound reasons. For instance, researchers may change raw
data to manage outliers, impute missing data, or adjust skewed data. All of these adjustments
are considered accepted practice in statistics. However, a researcher could easily use one of
these situations to change the data in a way that allows for a greater chance of the desired out-
come. For example, Sterba (2006) noted that statisticians have found that researchers are more
likely to drop outliers if doing so would result in a significant finding (Dunnette, 1965, Kimmel,
1996, and Rosenthal, 1994, as cited in Sterba, 2006). Consider a situation where a researcher
ran the analysis first with the outliers in the data set and then again without the outliers to see
which set of analyses results in a significant finding. Also, suppose that this researcher then
reports and publishes only the analysis that produced a significant result. In this situation, the
researcher is not honestly reporting how the preliminary analyses and data modifications were
conducted.
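The inflation this produces can be demonstrated with a short Monte Carlo sketch in pure Python. The data are synthetic null draws (no true group difference), and the 2-standard-deviation trimming rule, sample sizes, and trial count are illustrative assumptions, not prescriptions:

```python
import math
import random

random.seed(1)

def z_p_value(a, b):
    # Two-sample z-test p-value via the normal CDF (adequate for n = 50).
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def trim(sample):
    # Drop observations more than 2 SDs from the sample mean.
    m = sum(sample) / len(sample)
    sd = math.sqrt(sum((x - m) ** 2 for x in sample) / (len(sample) - 1))
    return [x for x in sample if abs(x - m) <= 2 * sd]

trials, hits_full, hits_either = 2000, 0, 0
for _ in range(trials):
    # Null case: both groups drawn from the same distribution.
    g1 = [random.gauss(0, 1) for _ in range(50)]
    g2 = [random.gauss(0, 1) for _ in range(50)]
    p_full = z_p_value(g1, g2)
    p_trim = z_p_value(trim(g1), trim(g2))
    hits_full += p_full < 0.05
    # The questionable practice: run both analyses, report whichever
    # one reaches significance.
    hits_either += (p_full < 0.05) or (p_trim < 0.05)

print(hits_full / trials)    # close to the nominal .05
print(hits_either / trials)  # higher: "report whichever works" inflates it
```

Reporting whichever of the two analyses is significant can only add false positives, never remove them, which is why honest reporting of all data modifications matters.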
Unfortunately, there are no specific standards or principles from the APA Ethics Code that
apply to this situation. Still, the ASA addresses issues with raw data in their ethical guide-
lines. First, they suggest that statisticians should “account for all data considered in a study and
explain the sample(s) actually used” (Committee on Professional Ethics, 1999, C-5). The ASA
also suggests that statisticians “report the data cleaning and screening procedures used, includ-
ing any imputation” (Committee on Professional Ethics, 1999, C-7). These guidelines indicate
that researchers should report any outliers that are dropped or subjects that are excluded from
analyses. In addition, these guidelines suggest that researchers should provide a rationale for
splitting their sample before running analyses. For instance, consider a treatment study for which
the researcher decides to only include participants whose depression scores were above a cer-
tain cutoff point prior to treatment. According to this principle, the researcher should account for
all participants for whom he or she collected data and then provide a rationale for only includ-
ing a subsample in the analysis. If the researcher describes the full sample and then conducts
analyses with only some of the participants, then such reported results could be misleading.
In sum, researchers should report any changes made to the raw data and provide a rationale
for such changes.
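In practice, this reporting obligation can be met with a simple sensitivity analysis: run the analysis both with and without the flagged cases and report both results, along with the exclusion rule used. The sketch below is purely illustrative (the scores, the flagged value, and the choice of Welch’s t statistic are hypothetical, not prescriptions drawn from the guidelines):

```python
import statistics

def welch_t(a, b):
    """Welch's two-sample t statistic (test statistic only, for illustration)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (mean_a - mean_b) / se

# Hypothetical symptom-change scores; 9.0 is a suspected outlier
# in the treatment group.
treatment = [2.1, 3.4, 1.8, 2.9, 3.1, 2.5, 9.0, 2.7, 3.0, 2.2]
control = [1.9, 2.0, 1.5, 2.4, 1.8, 2.1, 2.3, 1.7, 2.6, 2.0]
trimmed = [x for x in treatment if x != 9.0]

# Report BOTH analyses and the exclusion rule, rather than publishing
# only whichever version happens to reach significance.
t_full = welch_t(treatment, control)
t_trimmed = welch_t(trimmed, control)
print(f"All cases (n={len(treatment)} vs. {len(control)}): t = {t_full:.2f}")
print(f"Outlier removed (n={len(trimmed)} vs. {len(control)}): t = {t_trimmed:.2f}")
```

In this fabricated data set, dropping the outlier happens to strengthen the apparent effect, which is precisely why the ASA guidance asks for both analyses to be disclosed.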
Moreover, researchers might be tempted to engage in unethical behavior if doing so would
yield a desirable result. For example, consider a possible situation where a researcher is
evaluating his or her own treatment program and is motivated to quickly obtain positive results
that would support the use of this particular treatment. To obtain a positive outcome, the researcher
starts collecting data and reruns analyses every time he or she gets another 10 participants until he
or she finds a significant outcome, at which point the researcher stops recruiting new participants.
Simmons and colleagues (2011) demonstrated that a researcher who engages in this method of
data analysis poses a much greater risk of finding a false-positive result. If a researcher conducts
data analysis with the sole intent to prove his or her belief, then he or she may engage in unethical
behavior to achieve this result.
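The inflation that Simmons and colleagues (2011) documented can be checked directly by simulation. In the hedged sketch below, both groups are drawn from the same distribution (so any “significant” result is a false positive), the variance is treated as known for simplicity, and the analyst peeks after every 10 participants per group, stopping at the first |z| > 1.96; all numbers are illustrative:

```python
import random

random.seed(1)

def peeking_trial(max_n=100, batch=10, crit=1.96):
    """One null experiment with optional stopping: both groups are N(0, 1),
    so a 'significant' z at any peek is a false positive."""
    a, b = [], []
    while len(a) < max_n:
        a.extend(random.gauss(0, 1) for _ in range(batch))
        b.extend(random.gauss(0, 1) for _ in range(batch))
        # z test for a mean difference, treating the variance as known (= 1).
        z = (sum(a) / len(a) - sum(b) / len(b)) / (1 / len(a) + 1 / len(b)) ** 0.5
        if abs(z) > crit:
            return True  # the researcher stops here and reports a "finding"
    return False

trials = 2000
rate = sum(peeking_trial() for _ in range(trials)) / trials
print(f"False-positive rate with repeated peeking: {rate:.1%} (nominal level: 5.0%)")
```

With ten looks at the accumulating data, the realized error rate lands well above the nominal 5%, even though each individual test is valid on its own.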
Although there are no relevant APA ethical standards, Principle D from the APA Ethics Code
states, “Psychologists . . . take precautions to ensure that their potential biases . . . do not lead
to or condone unjust practices” (APA, 2010). In addition, several authors have warned against
confirmation bias in statistical work (e.g., Hubert & Wainer, 2011). Specifically, Rosnow and
Rosenthal (2011) argued that we must accept whatever findings come from proper data analysis.
They stated, “We cannot subsequently decide that we do not like the result we have calculated and
then reperform the estimate on the same sample using a different method in hope of getting a more
pleasing result” (p. 31). Simmons and colleagues (2011) argued that researchers should report all
findings in their studies, regardless of whether they are significant. These authors also suggested that
journal reviewers should be more accepting of nonsignificant and “imperfect” results (i.e., those
that do not support the study hypotheses). Confirmation bias is a tendency that is inherent in all
persons, including researchers (Nickerson, 1998). Researchers put much effort into developing
and describing hypotheses, and it is only natural for them to want to find results that support their
predictions. Still, it is important for researchers to be vigilant in guarding against such biases
affecting their data analysis. Thus, researchers should use data analyses in an attempt to find true
observations within a sample, not to bolster preconceived notions or special interests.
Data analysis may include both exploratory and confirmatory methods (Tukey, 1980).
Although psychologists often use confirmatory methods (i.e., hypothesis testing), exploratory
methods (e.g., exploratory factor analysis) and modeling have recently gained popularity
(Rodgers, 2010). Still, there are times when researchers use exploratory methods but report their
findings as confirmatory. Kerr (1998) coined the term HARKing (hypothesizing after the results
are known) and defined the issue as “presenting a post hoc hypothesis (i.e., one based on or
informed by one’s results) in one’s research report as if it were, in fact, an a priori hypothesis”
(p. 196). For example, consider a situation where a researcher has collected data and decides to
run dozens of regressions or correlations to see what variables are significantly associated with
one another, and then develops a hypothesis based on these findings. This method is exploratory,
as the analyses were run without a clear hypothesis in mind. If this researcher reports his or
her hypothesis as if it were created before he or she conducted the research, then he or she is
inaccurately reporting his or her methods.
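The scale of this problem is easy to quantify with a simulation. In the hypothetical sketch below, twenty predictors and one outcome are all pure noise (n = 30 per variable), yet screening every predictor at the .05 level turns up at least one “significant” correlation most of the time; |r| ≈ .361 is the approximate two-tailed .05 critical value for n = 30:

```python
import random

random.seed(7)

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

N, K, CRIT_R = 30, 20, 0.361  # sample size, predictors, approx. .05 cutoff

def screen_once():
    """Correlate K pure-noise predictors with a pure-noise outcome;
    return True if any correlation clears the significance cutoff."""
    outcome = [random.gauss(0, 1) for _ in range(N)]
    return any(
        abs(pearson_r([random.gauss(0, 1) for _ in range(N)], outcome)) > CRIT_R
        for _ in range(K)
    )

trials = 500
rate = sum(screen_once() for _ in range(trials)) / trials
print(f"Chance of at least one spurious 'finding': {rate:.0%} "
      f"(theory: 1 - .95**20 = 64%)")
```

A researcher who then builds a “confirmed” hypothesis around whichever correlation survived the screen is, most of the time, theorizing about noise.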
Consider another situation where a researcher uses structural equation modeling to analyze his
or her data. With structural equation modeling, a hypothesized model is proposed before analyses
are run. However, many paths in the model must be significant for the model to be a good fit as
a whole. To obtain a good fit, researchers often explore different options by including different
variables in the model, or by evaluating modification indices to find the best fit. Because these
methods are exploratory, the researcher must be careful in concluding the significance of his
or her best-fitting model. The final model may be a good fit for the current sample, but it may
not generalize to the larger population. Thus, replication of this study with another sample is
essential to ensure the most accurate conclusions: the same model should be fit to a second
sample to confirm that it holds for more than just one group of participants.
The APA Ethics Code does not address this issue, but several researchers have raised concerns
about post hoc hypotheses in confirmatory analyses (e.g., Kerr, 1998; Sterba, 2006). Hubert and
Wainer (2011) argued that researchers cannot examine their data before determining their hypotheses.
In addition, the ASA includes a guideline that addresses the use of too many statistical tests in
confirmatory analyses. Researchers typically run too many tests when there is no clear
a priori hypothesis. The guideline states,
Running multiple tests on the same data set at the same stage of an analysis increases the chances
of obtaining at least one invalid result. Selecting the one “significant” result from a multiplicity of
parallel tests poses a grave risk of an incorrect conclusion. Failure to disclose the full extent of tests
and their results in such a case would be highly misleading. (Committee on Professional Ethics,
1999, A-8)

Likewise, quantitative psychologists argue that data analytic methods for exploratory analyses,
like forward or stepwise regression equations, “are some of the most misused statistical meth-
ods available in all common software packages” (Carrig & Hoyle, 2011, p. 99). Most concerns
about using exploratory methods involve failing to use confirmatory methods to replicate the
findings and misreporting the methods used. Thus, researchers should specify and honestly report
whether they use confirmatory or exploratory analyses. For confirmatory analyses, researchers
should attempt to formulate hypotheses before any analyses are run on the data set. In the case of
exploratory analyses without a clear hypothesis (e.g., stepwise regression models), replication
is encouraged to validate the findings.
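The value of replication here can be illustrated with a toy forward-selection step. In the hedged sketch below (all data are simulated noise, and the single-step selection is a deliberate simplification of stepwise procedures), the predictor that looks best in an exploratory sample shows a much weaker correlation when the same variable is measured in a fresh sample:

```python
import random

random.seed(3)

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

N, K = 30, 20  # participants per sample, candidate predictors

def explore_then_replicate():
    """Forward-selection style: keep the predictor most correlated with the
    outcome in sample 1, then remeasure that variable in a new sample.
    Under the null, its correlation in the new sample is a fresh noise draw."""
    outcome = [random.gauss(0, 1) for _ in range(N)]
    best = max(
        abs(pearson_r([random.gauss(0, 1) for _ in range(N)], outcome))
        for _ in range(K)
    )
    replication = abs(pearson_r(
        [random.gauss(0, 1) for _ in range(N)],
        [random.gauss(0, 1) for _ in range(N)],
    ))
    return best, replication

trials = 300
results = [explore_then_replicate() for _ in range(trials)]
mean_best = sum(r[0] for r in results) / trials
mean_rep = sum(r[1] for r in results) / trials
print(f"Mean |r| of the selected predictor, exploratory sample: {mean_best:.2f}")
print(f"Mean |r| of the same predictor, replication sample:     {mean_rep:.2f}")
```

The shrinkage from the exploratory sample to the replication sample is exactly the overfitting that selection-based procedures invite, and it is why exploratory findings should be reported as exploratory until they replicate.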

Inadequate Supervision of Research Assistants

“Assistants can become [a] source of considerable purposeful or unintentional error, especially when
the monitoring of their work is inadequate” (Keith-Spiegel et al., 2010, p. 8). A researcher may
have a more novice student or research assistant run statistical analyses to provide him or her with
a learning opportunity. Although this is an integral part of training, it is important to recognize the
vulnerable position in which the student is placed. First, there is a power differential between the
student and the researcher. Thus, any unethical behaviors conducted by the researcher might be
passed down to the student, as he or she may not have the experience or training to know what is
correct. In addition, it may be difficult for a student to raise ethical concerns to his or her mentor
because the mentor is in a higher position of power. Keith-Spiegel and colleagues (2010) also
suggested that a student may not possess the competence to complete the data analysis but may
be hesitant to admit his or her inexperience because he or she wishes to appear knowledgeable and
competent to the supervisor.
The importance of supervision is certainly addressed in the literature. Standard 2.05 of the
APA Ethics Code discusses the delegation of work to others in the context of maintaining com-
petence, such that work should be delegated only to those who are expected to complete the work
competently (APA, 2010). Although this standard is very similar to the proposed guideline, it
does not specifically address delegation of data analysis. In relation to research integrity,
Keith-Spiegel and colleagues (2010) stated that “carefully monitoring students and other assistants is a
supervisor’s legitimate duty and responsibility” (p. 8). In addition, a document published by the
Department of Health and Human Services states that research mentors are ultimately responsi-
ble for training and supervising the trainee (Steneck, 2007). These sources of guidance suggest
that researchers should only delegate work to persons that they expect could complete the anal-
ysis competently. When data analysis is performed by unqualified persons, researchers should
provide adequate supervision (e.g., open discussion of data analytic methods) to ensure that such
persons perform these analyses competently.

Difficult or Stressful Work Environment

Keith-Spiegel and colleagues (2010) noted that researchers may experience stress and perform
more poorly due to working in a difficult environment (e.g., conflicts among administrators, toxic
mentoring, mistreatment of subordinates; p. 9). Gardenier (2011) suggested that an individual’s
morals may not be enough to prevent misconduct if the individual is practicing in an
environment that does not promote professionalism and ethical conduct. For example, consider
a situation where a researcher agrees to collaborate on a project and discovers that the principal
investigator (PI) is engaging in some unethical data analytic techniques. Ideally, the researcher
would express his or her concern to the PI, the PI would consider the researcher’s concerns, and
the PI would correct the behavior and proceed in an ethical manner. However, this is not always
the case. In a survey of 3,247 scientists, 12.5% (405 scientists) reported that they had overlooked
“others’ use of flawed data or questionable interpretation of data” (Martinson, Anderson, & de
Vries, 2005). If researchers do not confront others when unethical conduct is suspected, then they
are contributing to an environment that allows for this type of misbehavior.
However, confronting another researcher, especially a more advanced researcher, is difficult
to do. In the preceding example, the researcher may be concerned that confronting the PI would
strain their relationship and decrease the chance of collaboration in the future. It is also possible
that the PI is in a position of power over the researcher (e.g., the PI is on the researcher’s promo-
tion and tenure committee) and thus the researcher may fear retaliation from the PI. In addition,
if the researcher is in an environment that solely values productivity, then he or she may be less
likely to raise his or her concern, as it might thwart his or her opportunity to obtain a publication.
Yet if the researcher works in an environment that encourages and supports ethical behavior above
and beyond productivity, then there is a greater possibility that he or she would raise the ethical
issue to the PI. Thus, it is crucial that members, and particularly leaders, of research institutions
communicate the importance and value of ethical behavior.
The APA Ethics Code addresses the importance of resolving ethical issues. Standards 1.04 and
1.05 state that psychologists should report ethical violations but first try to resolve them with the
individual if possible (APA, 2010). Although these standards are not presented in the context of
research ethics, they are still applicable to data analysis. In fact, the ASA directly states that statis-
ticians should “support sound statistical analysis and expose incompetent or corrupt statistical
practice” (Committee on Professional Ethics, 1999, H-4). The ASA also suggests that statisticians
should first address an ethical conflict in private, but if that is not effective, then the unethi-
cal behavior should be reported to the ethics boards of the researchers’ institution (Committee
on Professional Ethics, 1999). Clearly, both the APA and the ASA share a common stance on
addressing ethical conflicts with others. In fact, Koocher and Keith-Spiegel (2010) found evi-
dence to support the effectiveness of peer confrontation in addressing minor research misconduct.
From this research, Keith-Spiegel and colleagues (2010) developed a manual to guide researchers
through peer confrontation. From these discussions, it is evident that researchers involved in data
analysis are expected to maintain an ethical environment. Gardenier (2003) aptly advised, “Do
not accept inferior statistical work from colleagues, publications, or ourselves. Never be afraid or
embarrassed to ask for statistical help when needed” (p. 2). In sum, researchers should strive to
create and maintain an ethical research environment by fostering a community that is aware of
and values ethics in data analysis.

CONCLUSIONS

Ethical issues arise in all aspects of practice in psychology, including data analysis. Ethical
guidelines currently exist for many areas of practice to guide psychologists through these issues.
However, there are no specific guidelines for psychologists who are conducting data analyses. The
ethical guidelines presented in this article are intended to inspire psychologists to think of their
actions in research as influencing more than just the scientific community. Researchers should
realize that their work influences individuals and recognize the responsibility that they have to
ensure that their work is statistically and ethically sound. Although further discussion of eth-
ical issues in data analysis is certainly warranted, psychologists at all levels of the profession
can participate in promoting ethical data analytic techniques by following these guidelines and
contributing to an environment that values ethically responsible research.

ACKNOWLEDGMENTS

I wish to acknowledge the helpful comments and suggestions made by Grayson Holmbeck, Ph.D.,
and Patricia Rupert, Ph.D.

REFERENCES

American Psychological Association. (2010). Ethical principles of psychologists and code of conduct. Washington, DC:
Author.
American Psychological Association. (2012). About APA. Retrieved from http://www.apa.org/about/index.aspx
Carrig, M. M., & Hoyle, R. H. (2011). Measurement choices: Reliability, validity, and generalizability. In A. T. Panter &
S. K. Sterba (Eds.), Handbook of ethics in quantitative methodology (pp. 127–158). New York, NY: Routledge
Academic.
Committee on Professional Ethics. (1999). Ethical guidelines for statistical practice. Retrieved from http://www.amstat.
org/about/ethicalguidelines.cfm
De Vries, R., Anderson, M. S., & Martinson, B. C. (2006). Normal misbehavior: Scientists talk about the ethics of
research. Journal of Empirical Research on Human Research Ethics, 1, 43–50.
Gardenier, J. S. (2003). Best statistical practices to promote research integrity. Professional Ethics Report, 16, 1–3.
Gardenier, J. S. (2011). Ethics in quantitative professional practice. In A. T. Panter & S. K. Sterba (Eds.), Handbook of
ethics in quantitative methodology (pp. 15–36). New York, NY: Routledge Academic.
Gardenier, J. S., & Resnik, D. B. (2002). The misuse of statistics: Concepts, tools, and research agenda. Accountability
in Research, 9, 64–74.
Graham, J. M. (2001, March). The ethical use of statistical analyses in psychological research. Paper presented at the
annual meeting of Division 17 (Counseling Psychology) of the American Psychological Association, Houston, TX.
Paper retrieved from http://myweb.facstaff.wwu.edu/~graham7/articles/Div17StatEthics.pdf
Grisso, T., Baldwin, E., Blanck, P. D., Rotheram-Borus, M. J., Schooler, N. R., & Thompson, T. (1991). Standards in
research: APA’s mechanism for monitoring the challenges. American Psychologist, 46, 758–766.
Hubert, L., & Wainer, H. (2011). A statistical guide for the ethically perplexed. In A. T. Panter & S. K. Sterba (Eds.),
Handbook of ethics in quantitative methodology (pp. 61–126). New York, NY: Routledge Academic.
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with
incentives for truth-telling. Psychological Science, 23, 524–532.
Keith-Spiegel, P., Sieber, J. E., & Koocher, G. P. (2010). Responding to research wrongdoing: A user-friendly guide.
Retrieved from http://www.ethicsresearch.com
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2,
196–217.
Koocher, G., & Keith-Spiegel, P. (2010). Opinion: Peers nip misconduct in the bud. Nature, 466, 438–440.
Koocher, G. P., Keith-Spiegel, P., Tabachnick, B. G., Sieber, J. E., & Butler, D. L. (2010). How do researchers respond
to perceived scientific wrongdoing? Overview, method and survey results [Supplemental material]. Nature, 466,
438–440.
Kraut, A. (2011, December). Despite occasional scandals, science can police itself. The Chronicle of Higher Education.
Retrieved from http://chronicle.com/section/Home/5
Martinson, B. C., Anderson, M. S., & de Vries, R. (2005). Scientists behaving badly. Nature, 435, 737–738.
Medin, D. L. (2011, November). A science we can believe in. Observer, 24(10). Retrieved from http://www.
psychologicalscience.org/index.php/publications/observer
Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2,
175–220.
Panter, A. T., & Sterba, S. K. (Eds.). (2011). Handbook of ethics in quantitative methodology. New York, NY: Routledge
Academic.
Parkinson, S. (1994). Commentary: Scientific or ethical quality? Psychological Science, 5, 137–138.
Peterson, C. (2009). Minimally sufficient research. Perspectives on Psychological Science, 4, 7–9.
Pomerantz, J. R. (1994). Commentary: On criteria for ethics in science: Commentary on Rosenthal. Psychological
Science, 5, 135–136.
Resnick, D. B. (2011). What is ethics in research and why is it important? Retrieved from the National Institute of
Environmental Health Sciences–National Institutes of Health web site: http://www.niehs.nih.gov/research/resources/
bioethics/whatis
Rodgers, J. L. (2010). The epistemology of mathematical and statistical modeling: A quiet methodological revolution.
American Psychologist, 65, 1–12.
Rosenthal, R. (1994). Science and ethics in conducting, analyzing, and reporting psychological research. Psychological
Science, 5, 127–134.
Rosnow, R. L., & Rosenthal, R. (2011). Ethical principles in data analysis: An overview. In A. T. Panter & S. K. Sterba
(Eds.), Handbook of ethics in quantitative methodology (pp. 37–60). New York, NY: Routledge Academic.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data
collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366.
Steneck, N. H. (2007). ORI introduction to the responsible conduct of research. Washington, DC: Department of Health
and Human Services.
Sterba, S. K. (2006). Misconduct in the analysis and reporting of data: Bridging methodological and ethical agendas for
change. Ethics & Behavior, 16, 305–318.
Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). Boston, MA: Pearson/Allyn & Bacon.
Tukey, J. W. (1980). We need both exploratory and confirmatory. American Statistician, 34, 23–25.
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and
explanations. American Psychologist, 54, 594–604.
