Qualitative Data Analysis
This document was originally part of the manual for The Ethnograph v4. It also exists in the
manual for The Ethnograph v5 as Appendix E. You may freely download, copy, print, and
disseminate this document provided that 1) the copyright notice is included, 2) you include the
entire document, and 3) you do not alter the document in any way. This permission does not
extend to the Manual itself, only to this electronic version of Appendix E. The information in this
document represents some, but not all, of the ideas of the developer of The Ethnograph. I reserve
the right to revise and change my ideas as I continue to develop them.
Introduction
This appendix is an essay on the basic processes in qualitative data analysis (QDA). It serves two
purposes. First, it offers some insights into the ideas and practices from which The Ethnograph
emerged and continues to evolve. Second, it is a simple introduction to QDA for the newcomer.
As Figure 1 suggests, the QDA process is not linear. When you do QDA you do not simply Notice,
Collect, and then Think about things, and then write a report. Rather, the process has the following
characteristics:
• Iterative and Progressive: The process is iterative and progressive because it is a
cycle that keeps repeating. For example, when you are thinking about things you
also start noticing new things in the data. You then collect and think about these
new things. In principle the process is an infinite spiral.
• Recursive: The process is recursive because one part can call you back to a
previous part. For example, while you are busy collecting things you might
simultaneously start noticing new things to collect.
• Holographic: The process is holographic in that each step in the process contains
the entire process. For example, when you first notice things you are already
mentally collecting and thinking about those things.
Thus, while there is a simple foundation to QDA, the process of doing qualitative data analysis is
complex. The key is to root yourself in this foundation; the rest will flow from it.
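To make these three characteristics concrete, here is a toy sketch in Python of the Notice, Collect, Think cycle. Everything in it, the record, the matching rules, the stopping point, is my invention for illustration; the point is only that thinking feeds back into noticing, so the cycle spirals rather than runs straight.

    # A toy record: each "thing" you might notice is a line of field notes.
    RECORD = [
        "nurse praises the woman after the contraction",
        "nurse states the three push rule",
        "doctor enters and checks on progress",
        "nurse praises the woman again",
    ]

    def notice(record, ideas):
        """Reading with your current ideas in mind: flag lines that
        mention any word you currently find interesting."""
        return [line for line in record
                if any(word in line for word in ideas)]

    def collect(noticed, ideas):
        """Sort the noticed lines into piles keyed by the matching idea."""
        piles = {}
        for line in noticed:
            for word in ideas:
                if word in line:
                    piles.setdefault(word, []).append(line)
        return piles

    def think(piles):
        """Thinking about a pile suggests new things to notice on the
        next pass; here, crudely, longer words that co-occur."""
        new_ideas = set()
        for lines in piles.values():
            for line in lines:
                new_ideas.update(w for w in line.split() if len(w) > 6)
        return new_ideas

    ideas = {"praise"}            # the first thing you happened to notice
    for _ in range(3):            # in principle, an infinite spiral
        piles = collect(notice(RECORD, ideas), ideas)
        ideas |= think(piles)     # thinking changes what you notice next
    print(sorted(ideas))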
Once you have produced a record, you focus your attention on that record, and notice interesting
things in the record. You do this by reading the record. In fact, you will read your record many
times. As you notice things in the record you name, or “code,” them. You could simply call them
A, B, C, etc., but most likely you will develop a more descriptive naming scheme.
Coding Things
Coding data is a simple process that everyone already knows how to do. For example, when you
read a book, underline or highlight passages, and make margin notes you are “coding” that book.
Coding in QDA is essentially the same thing. For now, this analogy is a good place to start.
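In software terms, a “highlighted passage with a margin note” amounts to a small annotation record: a code word attached to a span of numbered lines. Here is a minimal sketch in Python; the field names are my illustration, not The Ethnograph's internal format.

    from dataclasses import dataclass

    @dataclass
    class CodedSegment:
        """One 'highlighted passage with a margin note': a code word
        attached to a span of numbered lines in the data file."""
        code: str        # the name you gave the thing you noticed
        start_line: int  # first line of the segment
        end_line: int    # last line of the segment
        memo: str = ""   # optional margin note

    # Coding a record is building up a list of such annotations,
    # e.g. the passage on lines 12-18 strikes you as talk about progress:
    codebook = [
        CodedSegment("PROGRESS", 12, 18, "nurse reports on labor progress"),
        CodedSegment("PRAISE", 17, 18),   # segments may overlap
    ]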
As you become more experienced in QDA you learn that QDA “coding” is also more than this.
Further, you will learn the difference between codes as heuristic tools and codes as objective,
transparent representations of facts (Kelle and Seidel, 1995). In this essay I treat codes as heuristic
tools, or tools to facilitate discovery and further investigation of the data. At the end of this essay I
address the objectivist-heuristic code continuum.
Of course the jigsaw puzzle analogy differs in important ways from the QDA process. For example, in
QDA you don’t always have a final picture of the puzzle’s solution. Also, in QDA the puzzle pieces
are usually not precut. You create the puzzle pieces as you analyze the phenomena. Nonetheless,
the jigsaw puzzle analogy captures some important attributes of the QDA process.
A useful definition of the QDA process, and one that seems to fit well with the jigsaw puzzle
analogy, comes from Jorgensen (1989).
...the researcher sorts and sifts them, searching for types, classes, sequences,
processes, patterns or wholes. The aim of this process is to assemble or reconstruct
the data in a meaningful or comprehensible fashion (Jorgensen, 1989: 107).
A similar idea is expressed by Charmaz (1983). For Charmaz, who works in the “grounded theory”
tradition, the disassembling and reassembling occurs through the “coding” process.
Codes serve to summarize, synthesize, and sort many observations made of the
data....coding becomes the fundamental means of developing the
analysis....Researchers use codes to pull together and categorize a series of
otherwise discrete events, statements, and observations which they identify in the
data (Charmaz, 1983: 112).
At first the data may appear to be a mass of confusing, unrelated accounts. But by
studying and coding (often I code the same materials several times just after
collecting them), the researcher begins to create order (Charmaz, 1983: 114).
A concrete example of this process occurs in Freidson’s (1975) Doctoring Together. This passage
shows how the process moves back and forth between the noticing and collecting parts of the
process. I have “coded” this example to highlight this movement.
Noticing: ...we had carried out some 200 separate interviews...and had them
transcribed....Each interview was read, and sections of them which seemed to be
distinct incidents, anecdotes, or stated opinions about discrete topics....were then
typed on 5 x 7 McBee-Keysort cards on which were printed general topical
categories to guide coding.
Collecting: Buford Rhea then read all the cards and tentatively classified them into the simple
content categories we had decided upon in advance.
Noticing: He then read them again so as to test, revise, and refine the initial gross
classification....
Noticing: All the cards in that large pack of between 800 and 1,200 were then read one by
one....
Collecting: ...as they were read, the cards were sorted into preliminary topical piles. (Freidson,
1975: 270-271).
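Freidson's card-sorting workflow maps naturally onto a simple grouping operation. Here is a minimal sketch in Python, with hypothetical Card records standing in for the 5 x 7 cards; the excerpts and categories are invented for illustration.

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class Card:
        """Stand-in for one 5 x 7 card: an excerpt plus the topical
        category tentatively assigned to it on a first reading."""
        excerpt: str
        category: str

    cards = [
        Card("...the incident with the referral...", "colleague relations"),
        Card("...he felt the fee was unfair...", "fees"),
        Card("...another referral dispute...", "colleague relations"),
    ]

    # "the cards were sorted into preliminary topical piles"
    piles = defaultdict(list)
    for card in cards:
        piles[card.category].append(card)

    for topic, pile in piles.items():
        print(f"{topic}: {len(pile)} card(s)")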
A similar process takes place in the analysis of qualitative data. You compare and contrast each of
the things you have noticed in order to discover similarities and differences, build typologies, or find
sequences and patterns. In the process you might also stumble across both wholes and holes in the
data.
Example 1
The first example comes from a description of QDA by Danny Jorgensen (1989). While this
example repeats a previously quoted passage, this time I specifically identify the parts of the quote
that correspond to the parts of the QDA process.
Noticing: Analysis is a breaking up, separating, or disassembling of
research materials into pieces, parts, elements, or units.
Collecting: With facts broken down into manageable pieces, the researcher sorts and
sifts them,
Thinking: searching for types, classes, sequences, processes, patterns, or wholes.
The aim of this process is to assemble or reconstruct the data in a
meaningful or comprehensible fashion (Jorgensen, 1989: 107, my
emphasis).
Example 2
Another example comes from a discussion of grounded theory by Corbin and Strauss (1990).
Noticing/
Collecting: Open Coding is the part of analysis that pertains specifically to the
naming and categorizing of phenomena through close examination of the
data. ...During open coding the data are broken down into discrete
parts....
Example 3
A more concrete description of the process is provided by Schneider and Conrad (1983). They
describe the analysis of interviews they had collected in an interview study of epilepsy. In this
example the codes emerged from the data.
Collecting: ...and cut up and filed the pieces of paper according to the
codes....
Thinking: Fairly early in our project it became apparent that the medical
perspective on epilepsy did very little to describe our respondents'
experience (Schneider and Conrad, 1983: 242, my emphasis).
Example 4
Finally, Spradley (1979) sketches the traditional process of anthropological field work. In this
example, the noticing process is presented both on the general level of gathering data, and on the
particular level of examining the data. “Sorting through field notes” implies noticing something
Qualitative Data Analysis 7
Noticing/
Collecting: The field work period drew to a close and the ethnographer returned home
with notebooks filled with observations and interpretations. Sorting
through field notes in the months that followed....
If you just have the names of the streets in a city, you know something about the city. In some
cities, for example, many streets are named after presidents; in others, many are named after trees.
But simply knowing the names of the streets doesn’t necessarily tell you much about the
layout of the city, or how to get around in the city. For this you need a concrete representation of the
streets in relationship to each other. Further, you need to be able to distinguish between
neighborhood streets versus main traffic streets. Similarly, just having a collection of code words, or
collections of coded segments of data, does not tell you everything you want and need to know about
your data, and how the pieces of your data fit together.
My point at the moment is just that this critical micro-level work requires looking at a few
detailed passages, over and over again, doing the dialectic dance between an idea about how
text is organized and a couple of examples, figuring out what I was looking at, how to look
at it, and why (Agar, 1991: 190).
That critical way of seeing, in my experience at least, comes out of numerous cycles through
a little bit of data, massive amounts of thinking about that data, and slippery things like
intuition and serendipity (Agar, 1991: 193).
For that, you need a little bit of data, and a lot of right brain (Agar, 1991: 194).
The question is, how do you come up with that “little bit of data?” Obviously you start by reading
and rereading the data record. In the process you notice a few interesting things. You then collect
one or more of these things and intensively think about them. So you are still within the basic
model. I would like to carry this a step further and claim that when you identify and extract the
segment with which you want to work, you are in fact coding the data. The difference is that you are
not intensively coding, nor are you consumed by the sorting and sifting process.
I need to lay out a couple of stretches of transcript on a table so I can look at it all at once.
Then I need to mark different parts in different ways to find the pattern that holds the text
together and ties it to whatever external frame I’m developing. The software problem here
would be simple to solve. You’d need to be able to quickly insert different colored marks of
different kinds at different points so you could see the multiple connections across the text
all at once, sort of a multi-threaded DNA laid on the text so you could look at the patterns
that each thread revealed and then the patterns among the patterns (Agar, 1991:193).
Here Agar is describing a process where you read and notice many things in the data record. Then
you focus your attention on one part of the “coded” data record. This part can be chosen at random
or deliberately.
I would argue that The Ethnograph, at least, does in fact facilitate the two analytic alternatives
proposed by Agar. In fact, The Ethnograph is unique in its ability to approximate Agar’s “multi-
threaded DNA” model.
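What might such an approximation look like in the simplest possible terms? Here is a minimal sketch in Python of the marker-column idea: each code word becomes one thread of marks running alongside the numbered transcript, so co-occurrences show up as marks side by side. The transcript lines and code spans are invented for illustration; this is not The Ethnograph's actual display format.

    # Each "thread" is a code word with the line spans it covers.
    # Spans are (start, end) line numbers, inclusive; all values invented.
    THREADS = {
        "PROGRESS": [(1, 3)],
        "PRAISE":   [(3, 4)],
    }
    TRANSCRIPT = [
        "N: you're doing fine, baby's coming down",
        "N: we're making good progress now",
        "N: that's it, that was a good push",
        "W: how much longer do you think",
    ]

    def covers(spans, line_no):
        return any(a <= line_no <= b for a, b in spans)

    # Print one marker column per thread beside each numbered line,
    # so co-occurring codes show up as marks side by side.
    for i, line in enumerate(TRANSCRIPT, start=1):
        marks = "".join("|" if covers(spans, i) else " "
                        for spans in THREADS.values())
        print(f"{i:3} {marks}  {line}")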
In regard to the “little bit of data, lot of right brain” strategy, the coding and collecting of segments
of data can provide the foundation for the process of intensive analysis of a small bit of data. For
example, in order to find a piece of data to intensively analyze, Agar is still going through the
process of noticing and collecting a piece of the data. When using QDA software the preliminary
coding, and preliminary sorting and sifting, can generate pieces that become candidates for the
intensive analysis described by Agar. The trick is to avoid intensive coding early in the analytic
process.
But even if you have done intensive coding you can always change the analytical direction, and shift
your attention to a single piece of data for intensive analysis. In short, one approach does not
preclude the other. In fact they can complement each other, and software can facilitate the shift to
and from intensive analysis.
An example comes from my own experience. A group of colleagues and I were analyzing data
collected during a study of interactions between nurses and women during the process of giving
birth. One interesting thing we noticed in our data was that the labor room nurse periodically talked
about making “progress” during the birth process. We collected instances of “progress” talk and
scheduled an analysis meeting on the topic.
During an analysis meeting a team member would present one or two data segments. We also had
access to the original coded transcript and the video from which it was transcribed. We would spend
several hours analyzing and thinking about those segments. Each team member had a printout of
the segment and would cover it with notes, thoughts, and scribbles. An example printout is shown in
Figure 2. At the end of the session we would write up a preliminary memo summarizing our work.
During analysis our attention was not restricted to the particular segment. For example, we might
also examine and compare other “progress” segments with the segment we were analyzing. We
would also look at the transcript from which the segment came so that we could place our analysis
in its larger context. This context was not simply the immediately adjacent text within which the
segment was embedded, but the entire event of which it was a part.
This type of analytic process was not focused on gross analysis and summarization of a category of
the data. Rather, it emerged out of preliminary coding and followed Agar’s prescription of working
with “a little bit of data, and a lot of right brain” (Agar, 1991: 194). Sometimes the process took us
beyond the topic of the segment. Sometimes it took us deeper into the topic.
Because The Ethnograph embeds code words in the data file, it directly facilitates this type of
analysis. But before I address this I will offer two similar
analogies: topographical maps and ad hoc maps.
A topographical map is a way of coding the landscape so that it shows you the physical features of
the landscape. It gives you a very different picture of the physical landscape compared to a standard
road map. It shows you the hills and valleys, forests and clearings, and other features and details of
the landscape in relationship to each other. This makes it easier for you to navigate through
unfamiliar territory, especially off the roads. In a similar manner your “codes” can highlight
features and details of your data landscape.
The display of code words embedded in the data file (which The Ethnograph does) produces
something resembling a “topographical” map of your data. Just as a real topographical map can
help you discover and chart a path through the countryside, your codes can help you discover and
chart patterns through the “landscape” of your data. These patterns are not reducible to code words,
and are not discoverable from a simple examination of collections of coded segments. Yet these
patterns can only be discovered because of the way in which you coded your data.
An ad hoc map is the kind of map that you draw to tell people how to get to your house. When you
draw this map you highlight (i.e., code) certain features of the landscape as reference points. For
example, you might emphasize major intersections, stoplights, stores, etc. There are many
intersections, stoplights, and stores in the area, but a particular combination of them marks the path
to your house. In order to draw the map you have to know some general things about intersections,
stop lights, and stores. But this general knowledge does not reveal the path to your house. Knowing
and describing the path requires a knowledge of specific intersections, stoplights, and stores. Thus,
describing the path depends on coded features of the landscape, but the path is not reducible to the
coded features, nor is it revealed by studying collections of those features of the landscape.
A practical example from my own work illustrates the “threading” or “map” metaphors. It comes
from another analysis session on the second stage labor project. One day my colleagues and I
focused on a data fragment where the nurse displayed the “three push rule” to the laboring woman.
The plan was to intensively analyze this small piece of data.
While two of us attended to this data fragment the third member of the team drifted away from the
discussion and started looking at the fully coded transcript from which the segment came. She
noticed the co-occurrence of a “praise” utterance with the “three push rule” display. She also noticed
that this came at the end of a uterine contraction. Going back through many pages of the transcript
she noticed the absence of a “praise” utterance during all the previous contractions. After she
brought this to our attention we followed the patterns backwards to another pivotal event, and then
forward to the “three push rule” display. In this way we discovered a new phenomenon, a “progress
crisis,” which cut across, and transcended, the coded segments on the transcript.
This discovery depended on the fact that we had coded our transcript in a particular way, but the
discovery was not reducible to the codes, nor could it have been derived by simply inspecting
collections of coded segments. Because of the way we had coded the data, the features of the data
landscape were highlighted on the transcript, the “threads” were visible. Through the combination
of: 1) focused attention, 2) intensive analysis of a small part of the data, and 3) the ability to see
how several “threads” or “features” of the data came together over several pages in the transcript, we
were able to make the “progress crisis” discovery.
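In hindsight, the pattern my colleague spotted by eye can be described as a simple scan over the coded transcript: for each contraction, check whether a “praise” segment occurs within or just after it. Here is a sketch of that scan in Python, over invented segment data. We did this work by reading the printed transcript, not with software, so this is only a reconstruction of the logic.

    # Coded segments as (code, start_line, end_line); all values invented.
    SEGMENTS = [
        ("CONTRACTION", 10, 40),
        ("CONTRACTION", 60, 95),
        ("CONTRACTION", 120, 160),
        ("THREE_PUSH_RULE", 150, 155),
        ("PRAISE", 156, 158),
    ]

    def spans(code):
        return [(a, b) for c, a, b in SEGMENTS if c == code]

    # For each contraction, ask: did a PRAISE utterance occur within
    # or immediately after it? Absence early on is itself a finding.
    for n, (a, b) in enumerate(spans("CONTRACTION"), start=1):
        praised = any(a <= pa <= b + 5 for pa, _ in spans("PRAISE"))
        print(f"contraction {n}: praise {'present' if praised else 'absent'}")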
The diagram of the QDA process in The Ethnograph has two main elements:
1. Arrows which represent the basic Notice, Collect, Think process described in the
previous section.
2. Three Boxes which represent the three basic processes of The Ethnograph: Import and
Number, Code a Data File, and Search for Coded Segments.
While the placement of these arrows suggests that the process is progressive and linear, the diagram
preserves the nonlinear, iterative, and recursive aspects of the process as discussed in the previous
section.
Some types of discoveries are represented in the Discoveries box at the lower right-hand
corner of the diagram. Discoveries can be patterns, sequences, processes, wholes, classes, types, and
categories. Discoveries can emerge directly from coding the data, a path represented by the dark
arrow going from the Code a Data File box to the Discoveries box.
Objectivist Codes
An objectivist approach treats code words as “condensed representation of the facts described in the
data” (Seidel and Kelle, 1995). Given this assumption, code words can be treated as surrogates for
the text, and the analysis can focus on the codes instead of the text itself. You can then emulate
traditional distributional analysis and hypothesis testing for qualitative data. But first you must be
able to trust your code words.
To trust a code word you need: 1) to guarantee that every time you use a code word to identify a
segment of text that segment is an unambiguous instance of what that code word represents, 2) to
guarantee that you applied that code word to the text consistently in the traditional sense of the
concept of reliability, and 3) to guarantee that you have identified every instance of what the code
represents.
If the above conditions are met, then: 1) the codes are adequate surrogates for the text they identify,
2) the text is reducible to the codes, and 3) it is appropriate to analyze relationships among codes. If
you fall short of meeting these conditions then an analysis of relationships among code words is
risky business. I have identified some of these risks in an earlier work (Seidel, 1991).
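To make the objectivist program concrete, here is a minimal sketch in Python of the kind of distributional analysis it licenses: counting code frequencies and pairwise co-occurrences across cases. The case names and codes are invented for illustration.

    from collections import Counter
    from itertools import combinations

    # Codes applied per case (e.g., per interview); all data invented.
    CASES = {
        "interview_01": {"PROGRESS", "PRAISE"},
        "interview_02": {"PROGRESS"},
        "interview_03": {"PROGRESS", "PRAISE", "THREE_PUSH_RULE"},
    }

    freq = Counter(code for codes in CASES.values() for code in codes)
    pairs = Counter(pair for codes in CASES.values()
                    for pair in combinations(sorted(codes), 2))

    print("frequencies:", dict(freq))
    print("co-occurrences:", dict(pairs))
    # Treating these counts as findings presupposes that every code is an
    # unambiguous, consistent, and exhaustive surrogate for the text:
    # the three guarantees listed above.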
Heuristic Codes
In a heuristic approach, code words are primarily flags or signposts that point to things in the data.
The role of code words is to help you collect the things you have noticed so you can subject them to
further analysis. Heuristic codes help you reorganize the data and give you different views of the
data. They facilitate the discovery of things, and they help you open up the data to further intensive
analysis and inspection.
The burdens placed on heuristic codes are much lighter than those placed on objectivist codes. In a
heuristic approach code words more or less represent the things you have noticed. You have no
assurance that the things you have coded are always the same type of thing, nor that you have
captured every possible instance of that thing in your coding of the data. This does not absolve you of
the responsibility to refine and develop your coding scheme and your analysis of the data. Nor does it
excuse you from looking for “counter examples” and “confirming examples” in the data. The
heuristic approach does say that coding the data is never enough. It is the beginning of a process
that requires you to work deeper and deeper into your data.
Further, heuristic code words change and evolve as the analysis develops. The way you use the same
code word changes over time. Text coded at time one is not necessarily equivalent with text coded at
time two. Finally, heuristic code words change and transform the researcher who, in turn, changes
and transforms the code words as the analysis proceeds.
References
Agar, Michael (1991) “The right brain strikes back,” in Using Computers in Qualitative Research,
N. Fielding and R. Lee, eds., Newbury Park, CA: Sage Publications, 181-194.
Charmaz, Kathy (1983) “The grounded theory method: An explication and interpretation,” in
Contemporary Field Research: A Collection of Readings, Robert M. Emerson, ed., Boston: Little,
Brown and Company, 109-128.
Corbin, Juliet and Anselm Strauss (1990) Basics of Qualitative Research: Grounded Theory
Procedures and Techniques, Newbury Park, CA: Sage Publications.
Freidson, Elliot (1975) Doctoring Together: A Study of Professional Social Control, Chicago:
University of Chicago Press.
Jorgensen, Danny L. (1989) Participant Observation: A Methodology for Human Studies, Newbury
Park, CA: Sage Publications.
Schneider, Joseph W. and Peter Conrad (1983) Having Epilepsy: The Experience and Control of
Illness, Philadelphia: Temple University Press.
Seidel, John (1991) “Method and Madness in the Application of Computer Technology to
Qualitative Data Analysis,” in Using Computers in Qualitative Research, N. Fielding and R. Lee,
eds., Newbury Park, CA: Sage Publications, 107-116.
Seidel, John and Udo Kelle (1995) “Different Functions of Coding in the Analysis of Textual
Data,” in Computer-Aided Qualitative Data Analysis: Theory, Methods, and Practice, U. Kelle, ed.,
Thousand Oaks, CA: Sage Publications.
Spradley, James P. (1979) The Ethnographic Interview, New York: Holt, Rinehart and Winston.
Wiseman, Jacqueline P. (1979) Stations of the Lost: The Treatment of Skid Row Alcoholics,
Chicago: The University of Chicago Press.