Developing and Using a Codebook for the Analysis of Interview Data: An Example from a Professional Development Research Project

Field Methods 23(2): 136-155. © The Author(s) 2011. DOI: 10.1177/1525822X10388468

Jessica T. DeCuir-Gunby, Patricia L. Marshall, and Allison W. McCulloch
Abstract
This article gives specific steps on how to create a codebook for coding
interview data. The authors examine the development of theory- and
data-driven codes through the discussion of a professional development
(PD) research project. They also discuss how to train others to code using
the codebook, including how to establish reliability. The authors end with
practical suggestions from their experiences in creating a codebook.
Keywords
codebook, coding, interviews, team-based research
Analyzing interview data is a multistep sense-making endeavor. To
make sense of interviews, researchers must engage in the process of coding
data. Although coding interviews is widely recognized as a common step in
the interview analysis process, many researchers do not fully explicate how
this is done. In addition, experts in qualitative methodology have not
established a universally agreed-upon set of coding procedures that can be
easily replicated (Coffey and Atkinson 1996). Because of this, many
novice researchers are not certain of what procedures to use in coding
interview data or how to begin using such procedures. Qualitative
researchers often discuss the use of a codebook as one of the initial, and
arguably the most critical, steps in the interview analysis process (Fereday
and Muir-Cochrane 2006). Many articles and book chapters describe and
demonstrate the different steps involved in the codebook development
process (e.g., MacQueen et al. 1998; Ryan and Bernard 2000; Franklin and
Ballan 2001; Fonteyn et al. 2008; MacQueen et al. 2008; Laditka et al.
2009; Bernard and Ryan 2010).
The goal of this article is to continue this conversation by showing how
to create and use a codebook as a means of analyzing interview data, using
real-world education data. We begin with a basic discussion of codes,
codebooks, and coding, followed by a description of our professional
development (PD) research project. Using our research project as a real-life example, we demonstrate how to create a codebook by discussing the
development of both theory- and data-driven codes. Additionally, we
address training others to use the codebook and establishing interrater
reliability. We conclude with practical suggestions about the process of
creating a codebook.
Codes can be derived from theory (theory-driven), from the raw data (data-driven), or from the project's research goals and questions (structural), with most codes being theory- or data-driven (Ryan and Bernard 2003). The development of
theory-driven codes typically requires constant revisiting of theory,
whereas data-driven and structural codes necessitate repeated examination
of the raw data. Thus, code development is an iterative process.
A codebook is a set of codes, definitions, and examples used as a guide to
help analyze interview data. Codebooks are essential to analyzing qualitative research because they provide a formalized operationalization of the
codes (MacQueen et al. 1998; Crabtree and Miller 1999; Fereday and
Muir-Cochrane 2006; Fonteyn et al. 2008). Even so, like codes, codebooks
are developed through an iterative process that may necessitate revising
definitions as the researchers gain clearer insights about the interview data.
The more specificity in a codebook, the easier it is for coders to distinguish
between codes and to determine examples from nonexamples of individual
codes. In addition, the more detailed the codebook, the more consistency
there will be among coders when using it to code interviews. Thus, MacQueen
et al. (1998) suggest that the structure of codebooks should consist of six
components, including the code name/label, brief definition, full definition,
inclusion criteria, exclusion criteria, and examples. However, in this case, we
have chosen to structure our codebook using three components: code name/
label, full definition (an extensive definition that collapses inclusion and
exclusion criteria), and an example.
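To make this three-component structure concrete, here is a minimal sketch of a codebook entry as a small data structure. The definition paraphrases the cultural referencing code shown later in Table 1; the example quotation and all identifiers are hypothetical, not drawn from the authors' actual codebook.

```python
from dataclasses import dataclass

@dataclass
class CodebookEntry:
    """One codebook entry using the three-component structure described above."""
    name: str        # code name/label
    definition: str  # full definition, folding in inclusion/exclusion criteria
    example: str     # an illustrative excerpt from the data

# A hypothetical entry; wording is illustrative only.
cultural_referencing = CodebookEntry(
    name="Cultural referencing",
    definition=("Teacher refers, directly or indirectly, to elements of "
                "students' culture or background (race, socioeconomic status, "
                "language, outside-of-school experiences) that may affect "
                "the teaching-learning process."),
    example="'A lot of my kids count money at the corner store with their parents.'",
)
```

Keeping the inclusion and exclusion criteria inside one full definition, as the authors chose to do, trades MacQueen et al.'s finer-grained fields for a single statement coders can read at a glance.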
The actual process of coding is an integral part of the interview data analysis process. Coding is the assigning of codes (that have been previously
defined or operationalized in a codebook) to raw data. This allows researchers
to engage in data reduction and simplification. It also allows for data expansion
(making new connections between concepts), transformation (converting data
into meaningful units), and reconceptualization (rethinking theoretical associations; Coffey and Atkinson 1996). Further, through coding, researchers make
connections between ideas and concepts. Applying codes to raw data enables
the researcher to begin examining how the data support or contradict the theory guiding the research, as well as how they extend the current research literature. Coding is, in essence, a circular process in that the researcher may
then revisit the raw data based upon theoretical findings and the current
research literature. See Figure 1 for a visualization of the coding process.
According to Corbin and Strauss (2008), there are two major levels of coding: open coding and axial coding. When beginning to code interview data, the first step is to engage in the process of open coding, or "breaking data apart and delineating concepts to stand for blocks of raw data" (Corbin and Strauss 2008:195). Open coding allows for exploration of the ideas and
[Figure 1. The coding process: an iterative cycle linking theory, the research literature, code development, coding, and raw data.]
meaning that are contained in raw data. While engaging in open coding, the
researcher creates codes or concepts. Once codes have been created using
open coding, it is necessary to analyze them through the process of axial
coding. This higher level of coding enables researchers to identify any connections that may exist between codes.
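As a rough illustration of the two levels, the following sketch (all codes and segment IDs hypothetical) shows open codes attached to blocks of raw data and an axial pass that groups related open codes and surfaces connections through co-occurrence.

```python
# Open coding: concepts attached to blocks of raw interview data.
open_codes = {
    "seg-01": ["counting strategies", "manipulatives"],
    "seg-02": ["parent involvement"],
    "seg-03": ["manipulatives", "pacing pressure"],
}

# Axial coding: related open codes grouped under broader axes.
axial_groups = {
    "instructional practice": ["counting strategies", "manipulatives", "pacing pressure"],
    "home-school connections": ["parent involvement"],
}

# Connections between codes surface where open codes co-occur in a segment.
for seg, codes in open_codes.items():
    if len(codes) > 1:
        print(seg, "links", " and ".join(codes))
```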
When beginning the analysis process, inexperienced qualitative researchers are likely to have many questions, including the central question: "How do I create a codebook?" However, another question they should ask is, "What role does theory play in the creation of a codebook?" Similarly, once a codebook has been created, they may discover they need to ask, "How do I train others to use a codebook?" Questions such as these can frustrate and stymie the efforts of beginning researchers. Therefore, for the remainder of this article, we respond to these questions and describe how we created a codebook for analyzing interview data as part of our multiyear funded research project.
Our multiyear PD research project (NMD) involved kindergarten, first, and second grade teachers and their students. Briefly, this project explored
how teachers understand and adopt standards-based teaching practices
(National Council of Teachers of Mathematics 2000) that promote young
children's conceptual understanding in early mathematics. Enrollment in the project occurred at the start of each of three consecutive academic years (2005–2008), and teachers were required to attend our extensive PD intervention that was organized as four 2-day retreats (approximately 90 hours) spread over the course of each year the individual teacher participated in the project.2
In addition to participation in the PD retreats, teachers were required to
complete the project instruments (e.g., Teacher Dispositions Questionnaire), be videotaped (by a project research assistant [RA]) teaching eight
different mathematics lessons occurring at preselected intervals throughout
the academic year, observe a same-grade peer (also participating in the project) teaching mathematics lessons, and participate with that same-grade peer
in [post-teaching] reflection sessions led by an RA. Finally, each project
teacher was required to participate in a one-on-one interview at the start and
conclusion of each project year. Teachers were paid a $1,000 stipend for
each year they participated in the study.
The PD component of NMD was designed to facilitate teachers' critical understandings of the impact of culture on the teaching-learning process. To this end, we attempted to promote understanding of cultural relevance (Ladson-Billings 1994) as a pedagogical orientation as well as incorporation of its broad tenets into mathematics teaching and post-instructional reflections. The goal was for teachers to adopt culturally relevant pedagogy as part of their professional identities, which we theorized would, in turn, impact their orientations toward the issue of equity not only in their mathematics instruction but in their approach to the teaching-learning process in general. In addition, a goal was to promote deep mathematical understanding by analyzing how students' outside-of-school experiences with mathematics impacted their formal conceptions. This part of our study relates to the notion of a conception-based perspective in mathematics teaching. A conception-based perspective characterizes teachers who operate from the assumption that a student's mathematical reality is not independent of that student's ways of knowing and acting, that what a student sees, understands, and learns is constrained and afforded by what that student already knows, and that mathematical learning is a process of transformation of one's knowing and ways of acting (Simon et al. 2000).
Although the corpus of data for our project was generated from a variety of sources, this discussion focuses exclusively on the various steps we took in developing and using a codebook to analyze the one-on-one teacher interviews.
Creating a Codebook
As previously mentioned, codes are created from three major areas: theory (theory-driven), data (data-driven), and research goals (structural). In the case of NMD, only theory- and data-driven codes were created to assist in the coding of interviews. Boyatzis (1998) indicates that there are separate procedures for creating theory- and data-driven codes. Developing theory-driven codes involves three steps: (1) generate the code; (2) review and revise the code in context of the data; and (3) determine the reliability of coders and the code. Data-driven codes, on the other hand, involve five steps to inductively create codes for a codebook: (1) reduce raw information; (2) identify subsample themes; (3) compare themes across subsamples; (4) create codes; and (5) determine reliability of codes. We will use Boyatzis's framework to demonstrate the steps we used to create theory- and data-driven codes and codebook definitions. (See Figure 2 for a visual of the steps for creating a codebook.)
[Figure 2. Steps for creating a codebook. Theory-driven codes: generate codes; review and revise codes within the context of the data; establish reliability. Data-driven codes: reduce raw information; identify subsample themes; compare themes across subsamples; create codes; establish reliability.]
In generating the theory-driven codes, the PIs met regularly, providing explanations and responses to the various questions that arose about the different aspects of the process. This lessened the possibility of groupthink in that we discussed each topic until we reached consensus.
Our conversations explored the relationships between culturally relevant
pedagogy and a conception-based perspective and how these relationships
can be captured through codes. It was difficult to reduce intricate theories such as culturally relevant pedagogy and a conception-based perspective to a few words because, in many cases, we had to think about how to best operationalize abstract concepts. Thus, it must be stated that creating theory-driven codes from complex theories is a challenging process.
The last step in developing theory-driven codes is determining reliability, including discussing utility and implementation. (An in-depth
discussion on training others to code that includes establishing reliability
is provided later in this article.) The PIs individually practiced coding
interviews with the theory-driven codes and met to discuss individual
findings. We discovered that coding individually resulted in multiple
interpretations for the data and revealed inconsistencies in the coding
protocol. We changed to coding as a group, which allowed us to move
forward by sharing our reasons for utilizing codes in particular ways.
In addition, coding as a group also afforded us the opportunity to explore
examples and non-examples of the codes. These in-depth conversations
were very enriching, allowing us to reach consensus on the coding protocol. See Table 1 for a sample of final theory-driven codes, definitions, and examples.
[Table 1. Sample of final theory-driven codes, definitions, and examples. Codes shown include "Conception-based reference"; "Cultural referencing" (Teacher makes direct/indirect reference to specific elements of students' culture/background, e.g., race, socioeconomic status, language, other outside-of-school experiences, that may impact the teaching-learning process); and "Procedural understanding description" (Teacher describes or gives examples of what she believes characterizes procedural understanding).]
In deciding how to segment the interview text, we discussed the possibility of coding line by line, on the sentence level, on the paragraph level, or by what we labeled the "level of meaning." After reading several interviews, we realized that coding line by line and on the sentence level were often not meaningful. The paragraph level, on the other hand, often featured a variety of themes, making it impossible to label with only one code. Based on this, we decided to focus on the level of meaning. From this perspective, the "lumping" and "splitting" of text could occur at different locations, enabling a code to be made up of a line, sentence, or paragraph, as long as the essence is the same (MacQueen et al. 2008). However, we agreed a separate code was warranted when the unit of analysis could stand on its own and convey meaning outside of the larger context of the interview. This same rule applied to the implementation of theory-driven codes.
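One way to represent level-of-meaning units is to record each coded span with explicit boundaries, so that a unit can be a line, a sentence, or a paragraph as long as it carries one essence. A minimal sketch, with a hypothetical transcript and hypothetical codes:

```python
from dataclasses import dataclass

@dataclass
class CodedSegment:
    """A 'level of meaning' unit: any span that conveys one essence
    and can stand on its own outside the larger interview context."""
    start: int  # character offset into the transcript
    end: int
    code: str

transcript = ("I use base-ten blocks every day. "
              "My kids come in already knowing how to share snacks evenly, "
              "so I build fair-share problems around that.")

# Hypothetical segmentation: boundaries follow meaning, not punctuation.
segments = [
    CodedSegment(0, 33, "Teaching strategies"),
    CodedSegment(33, len(transcript), "Cultural referencing"),
]
for s in segments:
    print(s.code, "->", transcript[s.start:s.end])
```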
[Table 2. Sample of data-driven codes, with descriptions and examples. Codes shown include "Other influences on teachers," "Curricular references," and "Pedagogical struggles."]
To establish the reliability of the data-driven codes, we repeated the procedures that were used for the theory-driven codes, including practicing coding individually and as PIs to synchronize our orientations to the process. In addition, we discussed examples and non-examples of the codes. Finally, we reexamined the data-driven codes in relation to the theory-driven codes to identify and eliminate any overlap. See Table 2 for a sample of data-driven codes.
Training Others to Use the Codebook

When training the RAs to use the codebook, we began with an overview of the process, informing them that their input would be unique, and therefore truly valued. Further, we informed them that everyone's (including the PIs') interpretations of codes, as well as everyone's application of codes to any given data, could potentially be questioned and thereby subjected to critical analysis by any other member of the research team.
As co-PIs, we modeled this process of questioning each other's interpretations by making the RAs privy to our initial code development process, including disagreements that emerged among us as project leaders. Also, we encouraged the RAs to question each other's interpretations and applications of the codes. The RAs practiced and honed their coding skills using
interviews of various lengths. Similar to the process used by the PIs, the
RAs coded the interviews individually as well as collectively and shared
their thinking behind their coding as a group.
This entire process involved 2-hour weekly meetings over the course of
3 months, for a total of 24 hours. During this time, we specifically focused
on code names, definitions, examples, and non-examples. Code definitions
were written in simple, straightforward language. When we used project vernacular, we invited RAs to pose questions about meaning and, where appropriate, we replaced the terminology with simpler, plainer language.
We also discussed the coding process, including how to determine when a code begins, when it ends, and the possibility of multiple coding, or applying two or more codes to the same text (Bogdan and Biklen 2003). For our project, RAs were informed there would be situations when two or more codes were applicable to a specific segment of text; however, they were cautioned to use multiple coding sparingly and told that certain codes would be more likely to be used for multiple coding. For example, the code "teaching strategies" often accompanied the code "perception-based referencing" because for teachers to describe their general beliefs regarding how students learn (perception based), they often provided examples of what they did in class (teaching strategies).
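If coded segments are stored as sets of codes, multiple coding falls out naturally, and co-occurrence counts make pairings such as teaching strategies with perception-based referencing easy to spot. A small sketch with hypothetical data:

```python
from collections import Counter
from itertools import combinations

# Hypothetical multiply-coded segments: each segment may carry 2+ codes.
segment_codes = [
    {"Teaching strategies", "Perception-based referencing"},
    {"Curricular references"},
    {"Teaching strategies", "Perception-based referencing"},
    {"Teaching strategies"},
]

# Count how often pairs of codes co-occur on the same segment.
co_occurrence = Counter()
for codes in segment_codes:
    for pair in combinations(sorted(codes), 2):
        co_occurrence[pair] += 1

print(co_occurrence.most_common(1))
# [(('Perception-based referencing', 'Teaching strategies'), 2)]
```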
In addition to learning how to code using the codebook, we also
focused on the use of Computer Assisted Qualitative Data Analysis Software (CAQDAS). The entire research team attended structured sessions
on how to store, code, and retrieve data using Atlas.ti, a type of CAQDAS. A colleague with proficiency in using Atlas.ti conducted two training sessions. The first session focused on the fundamentals of using
Atlas.ti; the second session concentrated on coding using the project data.
At these sessions, each PI posed questions and raised issues that could
potentially emerge as the RAs were assigned to use the software in actual
interview coding.
When agreement is assessed with a simple correlation, raters can have little to no agreement on particular items but still obtain a relatively high correlation. Also, with Cohen's/Fleiss's kappa coefficients, the percentage of agreement decreases as more coders and more categories are added (Krippendorff 2004). Given the size of our research team and the number of categories (codes), the probability that we would get a high percentage agreement was very low.
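For readers who do opt for a statistical measure, a minimal sketch of Cohen's kappa for two coders follows; the code labels and data are hypothetical, and this is not part of the authors' own procedure, which is described next.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per segment."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: chance both coders pick the same code,
    # given each coder's marginal code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical coders labeling the same ten interview segments.
a = ["CR", "TS", "TS", "PB", "CR", "TS", "PB", "CR", "TS", "PB"]
b = ["CR", "TS", "PB", "PB", "CR", "TS", "PB", "TS", "TS", "PB"]
print(round(cohens_kappa(a, b), 2))  # observed agreement is 0.8; kappa is lower
```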
Because we wanted to focus on both the quantity of codes (what) and the quality of codes (how), we extended Miles and Huberman's (1994) approach and created our own process, which focused more on group consensus (Harry et al. 2005). This required us to code hard copies and calculate reliability by hand. (We didn't begin using the CAQDAS to code until the codebook was complete.) In doing so, we acknowledge that utilizing a statistical calculation such as Cohen's/Fleiss's kappa or Krippendorff's alpha would have allowed for a more robust calculation of reliability, thus adding credibility to our findings.
In engaging in our process, we first decided to consider the consistency of labeling text with each code. The RAs coded several pages of an interview at a time, followed by a discussion of when and how specific codes had been applied. Codes that were applied by all RAs with no variations were considered to have 100% agreement among the RAs. After we determined the codes that were more easily and consistently identifiable (e.g., "NMD referencing," "NMD Buddy/Reflection," and "Curricular referencing"), we then homed in on the codes that were being applied less consistently. For example, the code "teacher identity" proved to be problematic. Teacher identity was used to capture an individual teacher's description of how she sees herself professionally, culturally, and/or mathematically, including references to experiences that she acknowledges as having influenced her sense of self in any of these realms. After careful deliberation, we decided the teacher identity code was developed to answer the central question, "Who am I?" rather than to capture a general awareness of identity issues. This led us to redefine the teacher identity code. Similarly, we discussed and redefined all other problematic codes. Next, we practiced coding using the new definitions of the codes until the RAs had 100% agreement. Finally, we engaged in our process of checking reliability at the beginning and at least one time during the data analysis process to make sure that coding remained consistent.
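The consistency check described above, where a code counts as 100% agreement only when every RA applied it to exactly the same segments, can be expressed compactly. The sketch below assumes a hypothetical codings structure mapping each coder to the segments she coded; names and data are illustrative only.

```python
def full_agreement_codes(codings):
    """Return codes that every coder applied to exactly the same segments.

    codings: dict mapping coder name -> dict of segment_id -> set of codes.
    A code counts as 100% agreement when all coders attached it to an
    identical set of segments.
    """
    coders = list(codings)
    all_codes = {c for segs in codings.values()
                 for codes in segs.values() for c in codes}
    agreed = []
    for code in sorted(all_codes):
        segment_sets = [
            {seg for seg, codes in codings[coder].items() if code in codes}
            for coder in coders
        ]
        if all(s == segment_sets[0] for s in segment_sets[1:]):
            agreed.append(code)
    return agreed

# Three hypothetical RAs coding four segments.
codings = {
    "RA1": {1: {"Curricular referencing"}, 2: {"Teacher identity"},
            3: set(), 4: {"NMD referencing"}},
    "RA2": {1: {"Curricular referencing"}, 2: {"Teacher identity", "NMD referencing"},
            3: set(), 4: {"NMD referencing"}},
    "RA3": {1: {"Curricular referencing"}, 2: {"Teacher identity"},
            3: set(), 4: {"NMD referencing"}},
}
print(full_agreement_codes(codings))  # ['Curricular referencing', 'Teacher identity']
```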
Conclusion
Developing a codebook is a challenging process, and for our team, the entire
process, including code creation and coder training, took over a semester to
accomplish. Our final codebook consisted of 18 codes, including 10 theory-driven and 8 data-driven codes. Based on our experience, we have several
suggestions to offer other researchers who may embark on such a task.
1. Creating a codebook should be a team effort. The process of creating a codebook is complex and tedious, and, because of all the various components, it can easily become an overwhelming challenge if undertaken by one person. To lessen the challenge, we
highly recommend forming a codebook creation team, the members
of which bring divergent viewpoints and (if possible) varying
degrees of familiarity with the actual research project. Moreover,
we recommend that the team leaders make a deliberate effort to create an atmosphere between and among the members that encourages
and values critical questioning and constructive criticism. Researchers should be careful to formulate a team that strikes the most useful
balance between divergent viewpoints and efficient task completion.
Ultimately, researchers should remember that the more people
involved in the process, the more divergent viewpoints will emerge.
The more viewpoints, the greater the need for reconciliation and the
longer the process.
2. Developing a codebook is time intensive. Many steps are necessary
to create a codebook and to teach others how to use the codebook, all
of which are time consuming. To reiterate, the PIs spent 36 hours creating the codebook, and it took 24 hours to train the RAs to use it, for a total of 60 hours of codebook
development and training. Developing a codebook often requires
revisiting codes and reexamining data. Because of this, researchers
have to become comfortable with uncertainty and with the iterativeness of the process.
In addition, the actual coding of text was time intensive. Because of the complexity of the mathematics concepts being addressed, the lengths of the interviews varied; kindergarten teachers often had shorter interviews (approximately 30 minutes), whereas the first and second grade teachers had longer interviews (approximately 40–50 minutes). Coding often took 1–2 times the length of the interview. With a total of 145 interviews lasting 40 minutes each on average, it is estimated that the coding of all of the interviews took around 145–193 hours. As illustrated by the previously discussed time commitments, it is important to keep the notion of time in mind when planning your research time line.
Funding
The author(s) disclosed receipt of the following financial support for the research
and/or authorship of this article: The research reported in this article was supported
by a grant from the National Science Foundation (award #0353412).
Notes
1. In the case of grounded theory, codes are also referred to as concepts. For more
information on grounded theory, see Glaser and Strauss (1967) and Corbin and
Strauss (2008).
2. The intervention for each cohort was different each year of participation, with
each cohort experiencing the same first-year intervention, and Cohorts I and II
experiencing the same second-year experience. The third-year experience for
Cohort I teachers involved collaboration with the researchers in delivering the
intervention to teachers in Cohorts II and III.
3. We utilized the interviews of former project participants to practice identifying
and confirming our codes. We are aware that not all projects will have this
option. If this is the case, practice interviews will have to be recoded using the
final codebook.
References
Bernard, H. R., and G. W. Ryan. 2010. Analyzing qualitative data: Systematic
approaches. Thousand Oaks, CA: SAGE.
Bogdan, R., and S. K. Biklen. 2003. Qualitative research for education: An introduction to theory and methods. 4th ed. Boston: Allyn and Bacon.
Boyatzis, R. 1998. Transforming qualitative information: Thematic analysis and
code development. Thousand Oaks, CA: SAGE.
Coffey, A. J., and P. A. Atkinson. 1996. Making sense of qualitative data: Complementary research strategies. Thousand Oaks, CA: SAGE.
Cohen, J. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20:37–46.
Corbin, J., and A. L. Strauss. 2008. Basics of qualitative research. 3rd ed. Thousand
Oaks, CA: SAGE.
Crabtree, B. F., and W. L. Miller. 1999. Doing qualitative research. Thousand Oaks,
CA: SAGE.
Ryan, G. W., and H. R. Bernard. 2000. Data management and analysis methods. In The handbook of qualitative research, 2nd ed., eds. N. K. Denzin and Y. S. Lincoln, 769–802. Thousand Oaks, CA: SAGE.
Ryan, G. W., and H. R. Bernard. 2003. Techniques to identify themes. Field Methods 15:85–109.
Saldaña, J. 2009. The coding manual for qualitative researchers. Thousand Oaks, CA: SAGE.
Simon, M. A., R. Tzur, K. Heinz, M. Kinzel, and M. S. Smith. 2000. Characterizing a perspective underlying the practice of mathematics teachers in transition. Journal for Research in Mathematics Education 31:579–601.
Wolcott, H. F. 1994. Transforming qualitative data: Description, analysis, and
interpretation. Thousand Oaks, CA: SAGE.