Unnest (Explode) Multiple List Columns In A Pandas Dataframe
Last Updated: 23 Jul, 2025
Pandas is an open-source Python library for data manipulation. Have you ever encountered a dataset in which a column stores a list in each cell? Most Pandas operations, such as filtering, grouping, and aggregating, expect one scalar value per cell, so such columns usually need to be split so that each list element gets its own row. In this article, we will discuss how to unnest, or explode, multiple list columns in a Pandas DataFrame.
What is Pandas?
Pandas is an open-source data manipulation and analysis tool built on top of the Python programming language. It provides powerful data structures, such as DataFrame and Series, that allow users to easily manipulate and analyze data.
What are nested list columns?
Nested list columns are columns in a DataFrame where each cell contains a list of values, rather than a single scalar value. This occurs when the data is structured hierarchically, with each cell representing a collection of related sub-values.
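As a minimal sketch of what such a column looks like, the snippet below builds a small DataFrame and inspects its cells. The column names and values are made up for illustration and are not part of the article's dataset.

```python
import pandas as pd

# Hypothetical data: each cell of 'Scores' holds a list, not a scalar
df_nested = pd.DataFrame({
    'Student': ['A', 'B'],
    'Scores': [[90, 85], [70, 95, 60]],
})

# Check whether every cell in the column holds a Python list
is_list = df_nested['Scores'].apply(lambda cell: isinstance(cell, list))
print(is_list.all())

# Number of elements stored in each cell
list_lengths = df_nested['Scores'].apply(len)
print(list_lengths.tolist())
```

A check like this can help confirm that a column really is nested before trying to unnest it.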
Why unnest multiple list columns?
Unnesting multiple list columns in a data frame can be useful for several reasons:
- Data simplification: Unnesting converts complex nested data into a simpler tabular form, making it easier to understand and manipulate.
- Improved analysis: Unnested data can be analyzed more easily with Pandas and other data analysis tools, because it can be combined, filtered, and processed row by row.
- Improved visualization: Unnested data can be visualized more effectively, so insights are easier to convey through charts and graphs.
- Compatibility: Unnested data is often needed for certain types of analysis, such as machine learning modeling, which typically requires tabular data as input.
- Data integration: Unnesting can facilitate the integration of data from different sources or systems by aligning the data structure with a more standard table format.
- Normalization: Unnesting can be a step towards data normalization, which can improve data quality and reduce redundancy.
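The analysis benefit can be sketched as follows. This small example assumes a two-row frame invented for illustration and uses DataFrame.explode, which the sections that follow describe in detail.

```python
import pandas as pd

# Hypothetical two-row frame with one nested column
df_demo = pd.DataFrame({
    'Name': ['Arun', 'Raghav'],
    'Favourite Ice-cream': [['Strawberry', 'Choco-chips'],
                            ['Mango', 'Choco-chips']],
})

# One row per (Name, flavour) pair
flat = df_demo.explode('Favourite Ice-cream').reset_index(drop=True)

# Counting flavours is now an ordinary column operation
flavour_counts = flat['Favourite Ice-cream'].value_counts()
print(flavour_counts)

# Filtering works on scalar cells as usual
choco_fans = flat.loc[flat['Favourite Ice-cream'] == 'Choco-chips', 'Name'].tolist()
print(choco_fans)
```

Neither the count nor the filter would work directly on the list-valued column, which is the practical reason for unnesting first.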
Ways to unnest multiple list columns in a Pandas dataframe:
- Using the explode function
- Using the pandas.Series.explode function
- Using pandas.Series with a lambda function
Using the explode function
The explode function flattens list-like values in Series objects and DataFrame columns by turning each list element into its own row. In this method, we will see how to unnest multiple list columns with a single call to DataFrame.explode. Passing a list of columns requires pandas 1.3.0 or later, and the lists in each row must have the same length across the exploded columns.
Syntax:
df=df.explode(['column-1', 'column-2']).reset_index(drop=True)
Here,
- column-1, column-2: These are the columns that you want to unnest.
- df: It is the data frame that has those nested columns.
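A note on the index: explode repeats the original row label for every element it produces, which is why the syntax above ends with reset_index(drop=True). As a sketch on a shortened, two-row version of the data, the ignore_index parameter of DataFrame.explode gives the same fresh index in one step.

```python
import pandas as pd

# Shortened two-row frame, for illustration only
df_small = pd.DataFrame({
    'Name': ['Arun', 'Aniket'],
    'Favourite Ice-cream': [['Strawberry', 'Choco-chips'],
                            ['Vanilla', 'Black Currant']],
    'Favourite Soft-Drink': [['Coca Cola', 'Lemonade'],
                             ['Thumbs Up', 'Sprite']],
})
cols = ['Favourite Ice-cream', 'Favourite Soft-Drink']

# Default behaviour: the original row labels are repeated
kept_index = df_small.explode(cols)
print(kept_index.index.tolist())   # [0, 0, 1, 1]

# ignore_index=True relabels the rows 0..n-1
fresh_index = df_small.explode(cols, ignore_index=True)
print(fresh_index.index.tolist())  # [0, 1, 2, 3]
```

Either form works; ignore_index=True simply saves the extra method call.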
Implementations:
In this example, we have created a dataset, which has three columns, Name, Favourite Ice-cream and Favourite Soft-Drink, out of which Favourite Ice-cream and Favourite Soft-Drink columns are nested. We have unnested those columns using the explode function.
Python3
# Import the Pandas library
import pandas as pd
# Create a data frame that has nested columns
df = pd.DataFrame({'Name': ['Arun', 'Aniket', 'Ishita', 'Raghav', 'Vinayak'],
'Favourite Ice-cream': [['Strawberry', 'Choco-chips'],
['Vanilla', 'Black Currant'],
['Butterscotch', 'Chocolate'],
['Mango', 'Choco-chips'],
['Kulfi', 'Kaju-Kishmish']],
'Favourite Soft-Drink': [['Coca Cola', 'Lemonade'],
['Thumbs Up', 'Sprite'],
['Moutain Dew', 'Fanta'],
['Mirinda', 'Maaza'],
['7Up', 'Sprite']]})
# Print the actual data frame
print('Actual dataframe:\n', df)
# Unnest the nested columns
df = df.explode(['Favourite Ice-cream', 'Favourite Soft-Drink']
).reset_index(drop=True)
# Print the unnested data frame
print('\nDataframe after unnesting:\n', df)
Output:
Actual dataframe:
Name Favourite Ice-cream Favourite Soft-Drink
0 Arun [Strawberry, Choco-chips] [Coca Cola, Lemonade]
1 Aniket [Vanilla, Black Currant] [Thumbs Up, Sprite]
2 Ishita [Butterscotch, Chocolate] [Moutain Dew, Fanta]
3 Raghav [Mango, Choco-chips] [Mirinda, Maaza]
4 Vinayak [Kulfi, Kaju-Kishmish] [7Up, Sprite]
Dataframe after unnesting:
Name Favourite Ice-cream Favourite Soft-Drink
0 Arun Strawberry Coca Cola
1 Arun Choco-chips Lemonade
2 Aniket Vanilla Thumbs Up
3 Aniket Black Currant Sprite
4 Ishita Butterscotch Moutain Dew
5 Ishita Chocolate Fanta
6 Raghav Mango Mirinda
7 Raghav Choco-chips Maaza
8 Vinayak Kulfi 7Up
9 Vinayak Kaju-Kishmish Sprite
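The example above works because every row holds lists of the same length in both columns. The sketch below uses a made-up single-row frame to show what happens otherwise: exploding several columns at once raises a ValueError when the element counts in a row differ.

```python
import pandas as pd

# Hypothetical row whose two lists have different lengths
df_uneven = pd.DataFrame({
    'Name': ['Arun'],
    'Favourite Ice-cream': [['Strawberry', 'Choco-chips']],
    'Favourite Soft-Drink': [['Coca Cola']],  # only one element
})

try:
    df_uneven.explode(['Favourite Ice-cream', 'Favourite Soft-Drink'])
    raised = False
except ValueError as err:
    # pandas refuses to pair up lists of different lengths
    raised = True
    print('explode failed:', err)
```

When this happens, the data has to be cleaned or padded first so that the lists in each row line up.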
Using the pandas.Series.explode function
pandas.Series.explode splits a Series containing list-like values into multiple rows, one for each list element. In this method, we set the column that is not nested as the index and then apply pandas.Series.explode to each of the remaining columns.
Syntax:
df=df.set_index(['column-3']).apply(pd.Series.explode).reset_index()
Here,
- column-3: It is the column that is not nested, i.e., the one that already holds scalar values.
- df: It is the data frame that has those nested columns.
Implementations:
In this example, we have created a dataset, which has three columns, Name, Favourite Ice-cream and Favourite Soft-Drink, out of which Favourite Ice-cream and Favourite Soft-Drink columns are nested. We have unnested those columns using the pandas.Series.explode function.
Python3
# Import the Pandas library
import pandas as pd
# Create a data frame that has nested columns
df = pd.DataFrame({'Name': ['Arun','Aniket','Ishita', 'Raghav','Vinayak'],
'Favourite Ice-cream':[['Strawberry', 'Choco-chips'],
['Vanilla', 'Black Currant'],
['Butterscotch', 'Chocolate'],
['Mango', 'Choco-chips'],
['Kulfi', 'Kaju-Kishmish']],
'Favourite Soft-Drink':[['Coca Cola', 'Lemonade'],
['Thumbs Up', 'Sprite'],
['Moutain Dew', 'Fanta'],
['Mirinda', 'Maaza'],
['7Up', 'Sprite']]})
# Print the actual data frame
print ('Actual dataframe:\n',df)
# Unnest the nested columns
df=df.set_index(['Name']).apply(pd.Series.explode).reset_index()
# Print the unnested data frame
print ('\nDataframe after unnesting:\n',df)
Output:
Actual dataframe:
Name Favourite Ice-cream Favourite Soft-Drink
0 Arun [Strawberry, Choco-chips] [Coca Cola, Lemonade]
1 Aniket [Vanilla, Black Currant] [Thumbs Up, Sprite]
2 Ishita [Butterscotch, Chocolate] [Moutain Dew, Fanta]
3 Raghav [Mango, Choco-chips] [Mirinda, Maaza]
4 Vinayak [Kulfi, Kaju-Kishmish] [7Up, Sprite]
Dataframe after unnesting:
Name Favourite Ice-cream Favourite Soft-Drink
0 Arun Strawberry Coca Cola
1 Arun Choco-chips Lemonade
2 Aniket Vanilla Thumbs Up
3 Aniket Black Currant Sprite
4 Ishita Butterscotch Moutain Dew
5 Ishita Chocolate Fanta
6 Raghav Mango Mirinda
7 Raghav Choco-chips Maaza
8 Vinayak Kulfi 7Up
9 Vinayak Kaju-Kishmish Sprite
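To see why setting Name as the index matters in this method, it can help to look at what Series.explode does to a single Series. The sketch below uses a small, made-up Series: each list element becomes its own row, and the index label of the original row is repeated for it.

```python
import pandas as pd

# Hypothetical Series of lists, indexed by name
drinks = pd.Series([['Coca Cola', 'Lemonade'], ['7Up']],
                   index=['Arun', 'Vinayak'])

exploded = drinks.explode()

# The index label is repeated once per list element
print(exploded.index.tolist())  # ['Arun', 'Arun', 'Vinayak']
print(exploded.tolist())        # ['Coca Cola', 'Lemonade', '7Up']
```

Because the label travels with each element, reset_index() can turn it back into an ordinary Name column afterwards.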
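Using pandas.Series with a lambda function
A lambda function is an anonymous function that can take any number of arguments but contains only a single expression. In this method, a lambda function converts each list in a column to a pandas.Series, which spreads the list elements across columns, and stack() then pivots those columns into rows. reset_index() restores the index as a regular column, and the helper column level_1 created by stack() is dropped.
Syntax:
df=df.set_index('column-3').apply(lambda x: x.apply(pd.Series).stack()).reset_index().drop(columns='level_1')
Here,
- column-3: It is the column that is not nested, i.e., the one that already holds scalar values.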
- df: It is the data frame that has those nested columns.
Implementations:
In this example, we have created a dataset, which has three columns, Name, Favourite Ice-cream and Favourite Soft-Drink, out of which Favourite Ice-cream and Favourite Soft-Drink columns are nested. We have unnested those columns using pandas.Series with a lambda function.
Python3
# Import the Pandas library
import pandas as pd
# Create a data frame that has nested columns
df = pd.DataFrame({'Name': ['Arun','Aniket','Ishita', 'Raghav','Vinayak'],
'Favourite Ice-cream':[['Strawberry', 'Choco-chips'],
['Vanilla', 'Black Currant'],
['Butterscotch', 'Chocolate'],
['Mango', 'Choco-chips'],
['Kulfi', 'Kaju-Kishmish']],
'Favourite Soft-Drink':[['Coca Cola', 'Lemonade'],
['Thumbs Up', 'Sprite'],
['Moutain Dew', 'Fanta'],
['Mirinda', 'Maaza'],
['7Up', 'Sprite']]})
# Print the actual data frame
print ('Actual dataframe:\n',df)
# Unnest the nested columns
df = df.set_index('Name').apply(
    lambda x: x.apply(pd.Series).stack()).reset_index().drop(columns='level_1')
# Print the unnested data frame
print ('\nDataframe after unnesting:\n',df)
Output:
Actual dataframe:
Name Favourite Ice-cream Favourite Soft-Drink
0 Arun [Strawberry, Choco-chips] [Coca Cola, Lemonade]
1 Aniket [Vanilla, Black Currant] [Thumbs Up, Sprite]
2 Ishita [Butterscotch, Chocolate] [Moutain Dew, Fanta]
3 Raghav [Mango, Choco-chips] [Mirinda, Maaza]
4 Vinayak [Kulfi, Kaju-Kishmish] [7Up, Sprite]
Dataframe after unnesting:
Name Favourite Ice-cream Favourite Soft-Drink
0 Arun Strawberry Coca Cola
1 Arun Choco-chips Lemonade
2 Aniket Vanilla Thumbs Up
3 Aniket Black Currant Sprite
4 Ishita Butterscotch Moutain Dew
5 Ishita Chocolate Fanta
6 Raghav Mango Mirinda
7 Raghav Choco-chips Maaza
8 Vinayak Kulfi 7Up
9 Vinayak Kaju-Kishmish Sprite
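As a quick sanity check, the three approaches can be run side by side and compared. The sketch below uses a shortened, two-row version of the data; the helper make_df and the variable names are introduced here only for illustration.

```python
import pandas as pd

def make_df():
    # Hypothetical helper: returns a fresh two-row copy of the nested data
    return pd.DataFrame({
        'Name': ['Arun', 'Aniket'],
        'Favourite Ice-cream': [['Strawberry', 'Choco-chips'],
                                ['Vanilla', 'Black Currant']],
        'Favourite Soft-Drink': [['Coca Cola', 'Lemonade'],
                                 ['Thumbs Up', 'Sprite']],
    })

cols = ['Favourite Ice-cream', 'Favourite Soft-Drink']

# Method 1: DataFrame.explode on several columns
via_df_explode = make_df().explode(cols).reset_index(drop=True)

# Method 2: Series.explode applied to every non-index column
via_series_explode = (make_df().set_index('Name')
                      .apply(pd.Series.explode)
                      .reset_index())

# Method 3: lambda with pd.Series and stack
via_stack = (make_df().set_index('Name')
             .apply(lambda col: col.apply(pd.Series).stack())
             .reset_index()
             .drop(columns='level_1'))

# Compare the rows as plain Python lists to sidestep dtype differences
rows_a = via_df_explode.values.tolist()
rows_b = via_series_explode.values.tolist()
rows_c = via_stack.values.tolist()
print(rows_a == rows_b == rows_c)
```

On data like this, where the lists in each row have matching lengths, all three produce the same rows, so the choice mostly comes down to readability and the pandas version available.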