Open Coding PDF
Introduction
We need to give names to our ideas and concepts in order to define, analyze and share them with others. Once
a concept is defined, we can begin to examine it comparatively and ask questions to systematically specify its
states and to suggest possible relations with other concepts. It is also important that we name our concepts
appropriately, because people act toward things based on the meaning those things have for them,
and these meanings are derived from social interaction and modified through interpretation.[1]
To build concepts from a textual data source, we need to open up the text and expose the meanings, ideas
and thoughts in it. One process for analyzing textual content is Open Coding. Open Coding
includes labeling concepts and defining and developing categories based on their properties and
dimensions. It is used to analyze qualitative data and is part of many Qualitative Data Analysis
methodologies, such as Grounded Theory.
Building Concepts
The first step in qualitative data analysis is to go through the data (e.g. text) and break it down into pieces to
examine closely and compare for relations, similarities and dissimilarities. Different parts of the data are
marked with appropriate labels or codes to identify them for further analysis.
A concept is a labeled section of data that a researcher identifies as significant to some facts that data
represent. Concepts are abstract representations of events, objects, actions or interactions and they
allow researchers to group similar information to better understand the data.
Concepts can be of various types; communication, storm and private company are all examples of
concepts. Concepts may evoke certain imagery because they have their own properties. For example,
we can think of a data set representing telephone conversations between two participants and
label them as Telephone Communication. A labeled thing is something that can be located and
placed in a class of similar objects. Anything under a classification has one or more familiar properties or
characteristics; for example, sending information is a property of communication. It is important to understand
that the same data can be classified differently, depending on which properties of the data the researcher
is focusing on and how he/she is interpreting them.
Description
Wording that participants use in the interview
Coded data from in vivo codes, created by the researcher, academic terms
Table 1: Different naming strategies for codes
While analyzing the data, we sometimes find events or objects that share common characteristics even though
other properties separate them. We can use the common properties to group them under the same concept.
Different researchers may think of different names for the same data, but in general the name should be
based on the context.
Following is a partial transcript of an interview with a woman in her early 20s about drug use by
teens. The interviewer did not have preset questions to ask. He continued his questions based on the
interviewee's responses.
Interviewer: Tell me about teens and drug use.
Respondent: I think teens use drugs as a release from their parents [rebellious act]. Well, I don't know.
I can only talk for myself. For me, it was an experience [experience] [in vivo code]. You hear a lot about
drugs [drug talk]. You hear they are bad for you [negative connotation to the drug talk].
Source: Basics of Qualitative Research, Second Edition by Anselm Strauss & Juliet Corbin [6]
As you can see in the above interview transcript, we have grouped similar information using abstract
labels (e.g. drug talk). Some of the labels are taken directly from the data (e.g. experience, an in vivo
code). This process of going through the data line by line to assign codes is called line-by-line coding.
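The line-by-line coding of the transcript above can be sketched as a small data structure. This is only an illustration, assuming Python; the names CodedSegment and code_index are hypothetical and are not taken from any tool mentioned in this document. The segments and codes are the ones marked in the transcript.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CodedSegment:
    """One piece of the transcript together with the codes assigned to it."""
    text: str
    codes: List[str] = field(default_factory=list)
    in_vivo: bool = False  # True when the code reuses the participant's own wording


# Segments and codes copied from the annotated transcript above.
segments = [
    CodedSegment("I think teens use drugs as a release from their parents",
                 ["rebellious act"]),
    CodedSegment("For me, it was an experience", ["experience"], in_vivo=True),
    CodedSegment("You hear a lot about drugs", ["drug talk"]),
    CodedSegment("You hear they are bad for you",
                 ["negative connotation to the drug talk"]),
]


def code_index(coded: List[CodedSegment]) -> Dict[str, List[str]]:
    """Map each code to every piece of text labeled with it."""
    index: Dict[str, List[str]] = {}
    for segment in coded:
        for code in segment.codes:
            index.setdefault(code, []).append(segment.text)
    return index


index = code_index(segments)
in_vivo_codes = [code for s in segments if s.in_vivo for code in s.codes]
```

Keeping the text next to its codes makes it possible to look up every passage behind a code later, which is what the comparison step relies on.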
Glaser (1978) offered guidelines for preparing effective memos to generate substantive theory including
the following [3]:
Source: Nursing research: principles and methods by Denise F. Polit & Cheryl Tatano Beck
Defining Categories
As we continue to create codes for new concepts, it is not unexpected to reach a point where we
have more than a few pages of codes. At that stage, we should analyze the codes to find similarities
and group them into categories based on their common properties. We may also consider the dimensions of
the codes, which represent the location of a property along a continuum or range. The name of the
category can differ from the names of its codes in order to express its scope better and, if necessary, we can
also create sub-categories from the codes and then link them to categories.
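Grouping codes into a category with properties and dimensions can be sketched as follows. This is a minimal illustration in Python, not a prescribed format. The category "drug use" and its codes "hard-core use" and "soft core" come from the exercise in this document; the class Category, the helper category_of and the property name "intensity" are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Category:
    """A named group of codes that share common properties."""
    name: str
    codes: List[str] = field(default_factory=list)
    # property name -> (one end, other end) of its dimensional range
    dimensions: Dict[str, Tuple[str, str]] = field(default_factory=dict)


drug_use = Category(
    name="drug use",
    codes=["hard-core use", "soft core"],
    # "intensity" is a hypothetical property chosen for illustration.
    dimensions={"intensity": ("soft core", "hard-core")},
)


def category_of(code: str, categories: List[Category]) -> Optional[str]:
    """Return the name of the first category containing the code, if any."""
    for category in categories:
        if code in category.codes:
            return category.name
    return None
```

A code that returns None from category_of has not been grouped yet, which gives a simple way to list the codes that still need a category.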
Group conversation helps in making important decisions (e.g. singling out phenomena for coding,
deciding which existing concepts to use for coding, or when to create a new concept) [7]
Concept definitions become more exact and differentiations get more precise
The data perspective is maintained more consistently
Generally, more phenomena are discovered and processed.
Exercise
Now that we know how to do Open Coding, let's try to use it. Following is part of an interview transcript
with a woman in her 20s about drug use by teens. We would like to use open coding to analyze
the data.
Interviewer: Tell me about teens and drug use.
Respondent: I think teens use drugs as a release from their parents. Well, I don't know. I can only talk for
myself. For me, it was an experience. You hear a lot about drugs. You hear they are bad for you. There is
a lot of them around. You just get into them because they're accessible and because it's kind of a new
thing. It's cool! You know, it's something that is bad for you, taboo, a "no". Everyone is against it. If you
are a teenager, the first thing you are going to do is try them.
Interviewer: Do teens experiment a lot with drugs?
Respondent: Most just try a few. It depends on where you are and how accessible they are. Most don't
really get into it hard-core. A lot of teens are into pot, hash, a little organic stuff. It depends on what
phase of life you are at. It's kind of progressive. You start off with the basic drugs like pot. Then you go on
to try more intense drugs like hallucinogens.
So, how are we going to do that? One option is to print out the content, highlight the important
concepts and write down the codes. But manual open coding is not a good process, especially when
we have to deal with a large amount of data. Because searching is limited to rereading the text, it is
also error-prone. This process is simply impractical for large-scale data analysis with open
coding.
One alternative is Saturate, a web-based open coding tool from the University of Calgary [4]. We will use
this tool in our exercise. A video tutorial for the tool can be found here [4].
Now that we have software and know how to use it, let's start building concepts. Take a look at the
response: "I think teens use drugs as a release from their parents." This looks like an act of rebellion, so
we can code it as "rebellious act". The term "use" seems to mean something more. If we set aside the
context of drug use for a second and think about it, it may mean that the drugs are being used for some
other reason that we are not sure of at this stage. So, we should take a note (a memo) for future
reference. We continue to analyze the data line by line and add codes as long as
we find significantly new concepts. Tools like Saturate can help us both work more efficiently and
manage the data better.
In the process of line-by-line coding, we will soon be able to group the concepts into categories, such as
"drug use" for concepts like "hard-core use" and "soft core". Once most of what we find repeats concepts
we have already coded, we can stop labeling and move on to the next step (e.g. axial coding or selective
coding), depending on our research methodology.
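One way to get a rough feel for when new segments have stopped producing new concepts is to measure how many of the most recent codes were already seen earlier. The sketch below is a hypothetical heuristic written in Python for illustration only; it is not part of Grounded Theory or of any tool named in this document, and the function name new_code_ratio, the window parameter and the code "easy access" are assumptions.

```python
from typing import List, Set


def new_code_ratio(coded_segments: List[List[str]], window: int = 3) -> float:
    """Share of distinct codes in the last `window` segments not seen before them.

    A value near 0.0 suggests recent segments mostly repeat old concepts;
    a value near 1.0 suggests new concepts are still appearing.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    earlier: Set[str] = set()
    for codes in coded_segments[:-window]:
        earlier.update(codes)
    recent: Set[str] = set()
    for codes in coded_segments[-window:]:
        recent.update(codes)
    if not recent:
        return 0.0
    return len(recent - earlier) / len(recent)


# Codes assigned to successive segments; the first codes come from the
# transcript, the ordering is made up for the example.
history = [
    ["rebellious act"], ["experience"], ["drug talk"],
    ["drug talk"], ["experience", "drug talk"], ["rebellious act"],
]
repeating = new_code_ratio(history)              # last three segments add nothing new

# "easy access" is a hypothetical new code added for illustration.
still_growing = new_code_ratio(history[:3] + [["easy access"]], window=1)
```

The number is only a prompt for the researcher's judgment; deciding that coding can stop remains an interpretive decision.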
Problems
Although Open Coding is an important tool for Qualitative Data Analysis, it is also very time-consuming
and tedious work. Sometimes it is hard to decide when to stop line-by-line coding, and if the
researcher misses an important concept, he/she may have to repeat the tedious task.
Annotated Bibliography:
John V. Seidel. Qualitative Data Analysis [2]
This document was originally part of the manual for the Ethnograph v4. It explains the process of Open
Coding and also Qualitative Data Analysis in a broad sense.
Michael Nunes, Saul Greenberg, Carman Neustaedter. Using physical memorabilia as
opportunities to move into collocated digital photo-sharing [9]
A study on how physical memorabilia can be used as opportunities to move into home-based collocated
digital photo-sharing. The researchers used semi-structured contextual interviews, each approximately
one hour long. They used open coding as part of the data collection and analysis.
Sarker, S. Lau, F. Sahay, S. Building Inductive Theory of Collaboration in Virtual Teams: An
Adapted Grounded Theory Approach [8]
This paper outlines how grounded theory was adapted to develop a theory of collaboration in
virtual teams. The researchers studied virtual teams composed of students from two different
universities who were engaged in 14-week-long systems development projects. They analyzed the data using
adapted versions of open coding, axial coding and selective coding.
References
[1] Symbolic Interactionism. Blumer, H. (1969) [Link to Google Books]
[2] Qualitative Data Analysis. John V. Seidel [Link]
[3] Page 582, Nursing research: principles and methods by Denise F. Polit, Cheryl Tatano Beck
[Link to Google Books]
[4] Saturate, a web-based Open Coding tool developed by Dr. Sillito. University of Calgary
http://www.saturateapp.com
[5] Atlas.Ti, a commercial desktop application for Qualitative Data Analysis. http://www.atlasti.com/
[6] Chapter 8, Basics of Qualitative Research, Second Edition by Anselm Strauss & Juliet Corbin [Link]
[7] A Coding Scheme Development Methodology Using Grounded Theory for Qualitative Analysis of
Pair Programming. Institut für Informatik, Freie Universität Berlin. [Link]
[8] Building Inductive Theory of Collaboration in Virtual Teams: An Adapted Grounded Theory
Approach. Sarker, S. Lau, F. Sahay, S. [Link]
[9] Using physical memorabilia as opportunities to move into collocated digital photo-sharing.
Michael Nunes, Saul Greenberg, Carman Neustaedter. [Link]