Coding Qualitative Data
How many hours have you spent sitting in front of Excel spreadsheets trying to find new
insights from customer feedback?
You know that asking open-ended survey questions gives you more actionable insights
than asking your customers for just a numerical Net Promoter Score (NPS). But when
you ask open-ended, free-text questions, you end up with hundreds (or even
thousands) of free-text responses.
How can you turn all of that text into quantifiable, applicable information about your
customers’ needs and expectations?
When coding customer feedback, you assign labels to words or phrases that represent
important (and recurring) themes in each response. These labels can be words,
phrases, or numbers; we recommend using words or short phrases, since they’re easier
to remember, skim, and organize.
Coding qualitative research to find common themes and concepts is part of thematic
analysis, which is part of qualitative data analysis. Thematic analysis extracts themes
from text by analyzing the word and sentence structure.
Researchers use coding and other qualitative data analysis processes to help them
make data-driven decisions based on customer feedback. When you use coding to
analyze your customer feedback, you can quantify the common themes in customer
language. This makes it easier to accurately interpret and analyze customer
satisfaction.
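To make "quantify the common themes" concrete, here is a minimal sketch in Python. The responses, the code names, and the helper functions `code_frequencies` and `code_shares` are all hypothetical, invented for illustration; the sketch assumes a coder has already assigned codes by hand.

```python
from collections import Counter

# Hypothetical coded responses: each response carries the codes a coder assigned.
coded_responses = [
    {"text": "I waited 40 minutes on hold.", "codes": ["wait time"]},
    {"text": "Long wait, but the agent was friendly.", "codes": ["wait time", "staff attitude"]},
    {"text": "The agent solved my problem quickly.", "codes": ["staff attitude", "resolution"]},
    {"text": "Nobody picked up for half an hour.", "codes": ["wait time"]},
]


def code_frequencies(responses):
    """Count how many responses mention each code (once per response)."""
    counts = Counter()
    for response in responses:
        counts.update(set(response["codes"]))
    return counts


def code_shares(responses):
    """Return the fraction of responses (0 to 1) that mention each code."""
    total = len(responses)
    if total == 0:
        return {}
    return {code: n / total for code, n in code_frequencies(responses).items()}


frequencies = code_frequencies(coded_responses)
shares = code_shares(coded_responses)
for code, n in frequencies.most_common():
    print(f"{code}: {n} responses ({shares[code]:.0%})")
```

Once the codes are counted like this, you can rank themes or track them over time the same way you would any other numerical measure.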
You can automate the coding of your qualitative data with thematic analysis
software (like Thematic). Thematic analysis and qualitative data analysis software use
machine learning, artificial intelligence (AI), and natural language processing (NLP) to
code your qualitative data and break text up into themes.
…all of which will save you time (and lots of unnecessary headaches) when analyzing
your customer feedback.
To learn more about how thematic analysis software helps you automate the data
coding process, check out this article.
In deductive coding, you start with a predefined set of codes and then assign them to
the data. For example, let’s say you’re conducting a survey on customer experience. You
want to understand the problems that arise from long call wait times, so you choose to
make “wait time” one of your codes before you start looking at the data.
The deductive approach can save time and help guarantee that your areas of interest
are coded. But you also need to be careful of bias; when you start with predefined
codes, you have a bias as to what the answers will be. Make sure you don’t miss other
important themes by focusing too hard on proving your own hypothesis.
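One rough way to picture the deductive approach is a codebook that exists before any response is read. The sketch below is only an illustration: the `CODEBOOK` contents, the sample responses, and the functions `apply_codebook` and `uncoded` are hypothetical, and simple substring matching stands in for the judgment a human coder would apply.

```python
# Hypothetical deductive codebook, defined before reading any responses.
CODEBOOK = {
    "wait time": ["wait", "on hold", "queue"],
    "staff attitude": ["rude", "friendly", "helpful"],
}


def apply_codebook(response, codebook):
    """Return the predefined codes whose keywords appear in the response."""
    text = response.lower()
    return [
        code
        for code, keywords in codebook.items()
        if any(keyword in text for keyword in keywords)
    ]


def uncoded(responses, codebook):
    """Return responses that match no predefined code, for manual review."""
    return [r for r in responses if not apply_codebook(r, codebook)]


responses = [
    "I was on hold for 40 minutes.",
    "The agent was friendly, but the wait was too long.",
    "Your app keeps crashing when I log in.",
]

for response in responses:
    print(response, "->", apply_codebook(response, CODEBOOK))

# Responses that fall outside the codebook hint at themes you did not anticipate.
print("Needs review:", uncoded(responses, CODEBOOK))
```

Reviewing the responses that no predefined code captures is one way to guard against the bias described above, since those are the themes your hypothesis did not anticipate.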
Inductive coding, also called open coding, starts from scratch and creates codes
based on the qualitative data itself. You don’t have a set codebook; all codes arise
directly from the survey responses.
If you add a new code, split an existing code into two, or change the description of a
code, make sure to review how this change will affect the coding of all responses.
Otherwise, the same responses at different points in the survey could end up with
different codes.
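The review step after a codebook change can be sketched as follows. Everything here is hypothetical: the sample data, the code names, and the helpers `affected_by` and `split_code`. The `choose` function is a stand-in for a human coder deciding which of the new codes applies.

```python
# Hypothetical coded data: response ID -> (text, list of codes).
coded = {
    1: ("The website is slow.", ["usability"]),
    2: ("I can't find the search bar.", ["usability"]),
    3: ("Support replied within an hour.", ["support speed"]),
}


def affected_by(coded, code):
    """Return IDs of responses that carry `code` and need a second look."""
    return [rid for rid, (_, codes) in coded.items() if code in codes]


def split_code(coded, old_code, choose_new_code):
    """Replace `old_code` in every response with choose_new_code(text)."""
    updated = {}
    for rid, (text, codes) in coded.items():
        new_codes = [choose_new_code(text) if c == old_code else c for c in codes]
        updated[rid] = (text, new_codes)
    return updated


def choose(text):
    # Stand-in for a human decision between the two new codes.
    return "performance" if "slow" in text.lower() else "navigation"


print("Review before splitting:", affected_by(coded, "usability"))
recoded = split_code(coded, "usability", choose)
print("Still using the old code:", affected_by(recoded, "usability"))
```

The point of the sketch is that a change to one code triggers a pass over every response that carried it, not only the responses coded from that moment on.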
Sounds like a lot of work, right? Inductive coding is an iterative process, which means
it takes longer than deductive coding but is also more thorough. In return, it gives you
a more complete, less biased look at the themes throughout your data.
A flat coding frame assigns the same level of specificity and importance to each code.
While this might feel like an easier and faster method for manual coding, it can be
difficult to organize and navigate the themes and concepts as you create more and
more codes. It also makes it hard to figure out which themes are most important, which
can slow down decision making.
Hierarchical frames help you organize codes based on how they relate to one another.
For example, you can organize the codes based on your customers’ feelings on a
certain topic.
Hierarchical framing supports a larger code frame and lets you organize codes based
on organizational structure. It also allows for different levels of granularity in your
coding.
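As a small illustration of those levels of granularity, a hierarchical code can be stored as a path from general to specific and then counted at whatever depth you need. The topics, sentiments, and code names below are hypothetical, as is the `rollup` helper.

```python
from collections import Counter

# Hypothetical coded responses; each code is a path from general to specific.
assigned_codes = [
    ("customer service", "negative", "wait time"),
    ("customer service", "negative", "rude staff"),
    ("customer service", "positive", "helpful staff"),
    ("product", "negative", "price"),
    ("customer service", "negative", "wait time"),
]


def rollup(paths, depth):
    """Count codes after truncating each path to `depth` levels."""
    return Counter(path[:depth] for path in paths)


top_level = rollup(assigned_codes, 1)     # topic only
by_sentiment = rollup(assigned_codes, 2)  # topic + sentiment
detailed = rollup(assigned_codes, 3)      # full path

for path, n in by_sentiment.most_common():
    print(" > ".join(path), n)
```

A flat frame would only give you the last view; the hierarchy lets the same coded data answer both broad and narrow questions.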
Whether your code frames are hierarchical or flat, they should be flexible.
Manually analyzing survey data takes a lot of time and effort; make sure you can use
your results in different contexts.
For example, if your survey asks customers about customer service, you might only use
codes that capture answers about customer service. Then you realize that the same
survey responses have a lot of comments about your company’s products. To learn
more about what people say about your products, you may have to code all of the
responses from scratch! A flexible coding frame covers different topics and insights,
which lets you reuse the results later on.
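For example, a single code (say, one about cleanliness) could cover responses that
include words and phrases such as:
Clean
Tidy
Dirty
Dusty
Looked like a dump
Could eat off the floor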
Having only a few codes and hierarchical framing makes it easier to group different
words and phrases under one code. If you have too many codes, especially in a flat
frame, your results can become ambiguous and themes can overlap. Manual coding
also requires the coder to remember or be able to find all of the relevant codes; the
more codes you have, the harder it is to find the ones you need, no matter how
organized your codebook is.
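One possible way to spot overlapping codes in a flat frame is to check how often two codes land on the same responses. The sketch below uses Jaccard similarity for this, which is a choice made here for illustration and not a method the article prescribes; the data, the 0.5 threshold, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical flat frame: code -> set of response IDs that received it.
code_to_responses = {
    "clean": {1, 2, 3, 4},
    "tidy": {2, 3, 4},
    "price": {5, 6},
}


def jaccard(a, b):
    """Share of responses two codes have in common, from 0 (none) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_codes(frame, threshold=0.5):
    """Return (code, code, score) for pairs overlapping at or above `threshold`."""
    pairs = []
    for c1, c2 in combinations(sorted(frame), 2):
        score = jaccard(frame[c1], frame[c2])
        if score >= threshold:
            pairs.append((c1, c2, score))
    return pairs


candidates = overlapping_codes(code_to_responses)
for c1, c2, score in candidates:
    print(f"Consider merging '{c1}' and '{c2}' (overlap {score:.2f})")
```

Pairs that score high are candidates for merging under one broader code; whether to merge them remains a judgment call for the coder.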
Make accuracy a priority
Manually coding qualitative data means that the coder’s cognitive biases can influence
the coding process. For each study, make sure you have coding guidelines and training
in place to keep coding reliable, consistent, and accurate.
One thing to watch out for is definitional drift, which occurs when the data at the
beginning of the dataset is coded differently than the material coded later. Check for
definitional drift across the entire dataset and keep notes with descriptions of how the
codes vary across the results.
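A rough first check for definitional drift is to compare how often each code was used early and late in the coding sequence. This is only a sketch under stated assumptions: the sample codes, the 0.25 threshold, and the functions `code_share` and `drift_report` are hypothetical, and a large gap can also reflect a real difference in the responses, so it flags codes for review without proving drift.

```python
from collections import Counter


def code_share(codes):
    """Return the fraction of the given codes that each code accounts for."""
    total = len(codes)
    return {code: n / total for code, n in Counter(codes).items()}


def drift_report(codes_in_coding_order, min_gap=0.25):
    """Compare code shares in the first and second half of the coding sequence.

    Returns {code: late_share - early_share} for codes whose share changed
    by at least `min_gap`.
    """
    half = len(codes_in_coding_order) // 2
    early = code_share(codes_in_coding_order[:half])
    late = code_share(codes_in_coding_order[half:])
    report = {}
    for code in sorted(set(early) | set(late)):
        gap = late.get(code, 0.0) - early.get(code, 0.0)
        if abs(gap) >= min_gap:
            report[code] = gap
    return report


# Hypothetical sequence: one code per response, in the order they were coded.
codes = ["service", "service", "service", "price",
         "service", "staff", "staff", "price"]

report = drift_report(codes)
for code, gap in report.items():
    print(f"{code}: share changed by {gap:+.0%}, review its definition")
```

In this made-up sequence, "service" fades while "staff" appears only later, which is the kind of pattern worth checking against your code descriptions.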
If you have multiple coders working on one team, have them check one another’s
coding to help eliminate cognitive biases.
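When two coders label the same responses, their agreement can be put into a number. The sketch below uses Cohen's kappa, a standard agreement statistic that is brought in here as an illustration and is not named in the article. It assumes each coder assigns exactly one code per response; the sample codes and function names are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each gave one code per response."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same, non-empty set of responses.")
    n = len(codes_a)
    # Observed agreement: share of responses where the coders chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Expected agreement by chance, from each coder's own code frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


def disagreements(codes_a, codes_b):
    """Return the positions where the two coders chose different codes."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]


coder_a = ["wait time", "wait time", "price", "price"]
coder_b = ["wait time", "price", "price", "price"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
print("Discuss responses at positions:", disagreements(coder_a, coder_b))
```

A kappa of 1 means the coders agree on every response, and a value near 0 means they agree no more often than chance would predict. The list of disagreements gives the team concrete responses to discuss when refining the coding guidelines.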
1. Coding is the process of labeling and organizing your qualitative data to identify themes.
After you code your qualitative data, you can analyze it just like numerical data.
2. Inductive coding (without a predefined code frame) is more difficult, but less prone to
bias, than deductive coding.
3. Code frames can be flat (easier and faster to use) or hierarchical (more powerful and
organized).
4. Your code frames need to be flexible enough that you can make the most of your results
and use them in different contexts.
5. When creating codes, make sure they cover several responses, contrast one another,
and strike a balance between too much and too little information.
6. Consistent coding = accuracy. Establish coding procedures and guidelines and keep an
eye out for definitional drift in your qualitative data analysis.