Python - Flatten Nested Keys
Last Updated :
06 Apr, 2023
Sometimes, while working with Python data, we can have a problem in which we need to perform the flattening of certain keys in nested list records. This kind of problem occurs while data preprocessing. Let us discuss certain ways in which this task can be performed.
Method #1: Using loop
This is a brute-force method to perform this task. In this, we construct new dictionary by assigning base keys and then perform the flattening of inner key elements using a nested loop.
Python3
# Python3 code to demonstrate working of
# Flatten Nested Keys
# Using loop
# initializing list
test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4},
{'rating': 17, 'price': 8}]},
{'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
# printing original list
print("The original list is : " + str(test_list))
# Flatten Nested Keys
# Using loop
res = []
for sub in test_list:
temp1 = {
'Gfg': sub['Gfg'],
'id': sub['id']
}
for data in sub.get('data', []):
res.append({
**temp1,
'rating': data['rating'],
'price': data['price']})
# printing result
print("The flattened list : " + str(res))
OutputThe original list is : [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}, {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
The flattened list : [{'Gfg': 1, 'id': 1, 'rating': 7, 'price': 4}, {'Gfg': 1, 'id': 1, 'rating': 17, 'price': 8}, {'Gfg': 1, 'id': 2, 'rating': 18, 'price': 19}]
Time Complexity: O(n * m), where n is the number of elements in the outer list and m is the average number of elements in each inner 'data' list.
Auxiliary Space: O(n), where n is the number of elements in the outer list. This is because we use a single res list to store the flattened elements and its size grows with each iteration of the loop.
Method #2: Using list comprehension + zip() + itemgetter()
The combination of the above functions can be used to perform this task. In this, we extract the required pairs using itemgetter() and combine pairs using zip(). The compilation of data is using list comprehension.
Python3
# Python3 code to demonstrate working of
# Flatten Nested Keys
# Using list comprehension + zip() + itemgetter()
from operator import itemgetter
# initializing list
test_list = [{'Gfg' : 1, 'id' : 1, 'data' : [{'rating' : 7, 'price' : 4},
{'rating' : 17, 'price' : 8}]},
{'Gfg' : 1, 'id' : 2, 'data' : [{'rating' : 18, 'price' : 19}]}]
# printing original list
print("The original list is : " + str(test_list))
# initializing base and flatten keys
base_keys = 'Gfg', 'id'
flatten_keys = 'rating', 'price'
# Flatten Nested Keys
# Using list comprehension + zip() + itemgetter()
res = [dict(zip(base_keys + flatten_keys, itemgetter(*base_keys)(sub) + itemgetter(*flatten_keys)(data))) for sub in test_list for data in sub['data']]
# printing result
print("The flattened list : " + str(res))
Output :
The original list is : [{'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}], 'id': 1, 'Gfg': 1}, {'data': [{'rating': 18, 'price': 19}], 'id': 2, 'Gfg': 1}] The flattened list : [{'price': 4, 'rating': 7, 'id': 1, 'Gfg': 1}, {'price': 8, 'rating': 17, 'id': 1, 'Gfg': 1}, {'price': 19, 'rating': 18, 'id': 2, 'Gfg': 1}]
Time complexity: O(n) where n is the total number of elements in the nested list. The reason for this is that the code needs to iterate through the elements of the nested list in the list comprehension.
Auxiliary space: O(n) as the result list will have n elements. This is because the result list is being constructed by iterating through the elements of the nested list and adding new elements to it.
Method #3 : Using update() method
Python3
# Python3 code to demonstrate working of
# Flatten Nested Keys
# Using loop
# initializing list
test_list = [{'Gfg' : 1, 'id' : 1, 'data' : [{'rating' : 7, 'price' : 4},
{'rating' : 17, 'price' : 8}]},
{'Gfg' : 1, 'id' : 2, 'data' : [{'rating' : 18, 'price' : 19}]}]
# printing original list
print("The original list is : " + str(test_list))
# Flatten Nested Keys
# Using loop
res = []
for sub in test_list:
for j in sub["data"]:
j.update({"id":sub["id"]})
j.update({"Gfg":sub["Gfg"]})
res.append(j)
# printing result
print("The flattened list : " + str(res))
OutputThe original list is : [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}, {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
The flattened list : [{'rating': 7, 'price': 4, 'id': 1, 'Gfg': 1}, {'rating': 17, 'price': 8, 'id': 1, 'Gfg': 1}, {'rating': 18, 'price': 19, 'id': 2, 'Gfg': 1}]
Time complexity: O(n*m), where n is the length of the input list and m is the average length of the "data" list in each dictionary.
Auxiliary Space: O(nm), because we are creating a new list to store the flattened data, which can have a size of up to nm elements.
Method #4: Using recursion
We can also flatten the nested keys using recursion. Here's how we can implement it:
- Define a function flatten_keys that takes a dictionary as an input.
- Initialize an empty dictionary result.
- Loop through each key-value pair in the dictionary.
- If the value is a dictionary, call flatten_keys recursively with the value as the input and store the result in result.
- If the value is a list, loop through each element in the list and call flatten_keys recursively with the element as the input and store the result in result.
- If the value is not a dictionary or list, add it to the result dictionary with the same key.
- Return the result dictionary.
Python3
def flatten_keys(d):
result = {}
for k, v in d.items():
if isinstance(v, dict):
flat_v = flatten_keys(v)
for flat_k, flat_v in flat_v.items():
result[k + '.' + flat_k] = flat_v
elif isinstance(v, list):
for i, item in enumerate(v):
flat_item = flatten_keys(item)
for flat_k, flat_v in flat_item.items():
result[f"{k}.{i}.{flat_k}"] = flat_v
else:
result[k] = v
return result
# input list
test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4},
{'rating': 17, 'price': 8}]},
{'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
flattened_list = []
for d in test_list:
flattened_dict = flatten_keys(d)
flattened_list.append(flattened_dict)
print("The flattened list : " + str(flattened_list))
OutputThe flattened list : [{'Gfg': 1, 'id': 1, 'data.0.rating': 7, 'data.0.price': 4, 'data.1.rating': 17, 'data.1.price': 8}, {'Gfg': 1, 'id': 2, 'data.0.rating': 18, 'data.0.price': 19}]
Time complexity of this approach is O(n), where n is the number of elements in the nested dictionary.
Auxiliary space required is also O(n), where n is the number of elements in the flattened dictionary.
Method #5: Using a stack
We can flatten the nested dictionaries using a stack data structure. We start with an empty dictionary and a stack containing the input dictionary. We pop a dictionary from the stack, iterate through its key-value pairs, and if the value is a dictionary, we add the current key to each of its keys and push it onto the stack. If the value is a list, we iterate through each element of the list and add the current index and key to its keys and push it onto the stack. Otherwise, we add the key-value pair to the flattened dictionary.
step-by-step approach :
- Define a function named flatten_keys_stack that takes in a dictionary as its only argument.
- Create an empty dictionary named flattened_dict to hold the flattened key-value pairs.
- Create a list named stack containing a tuple of the input dictionary and an empty string (representing the current prefix for keys).
- While the stack is not empty, pop a tuple from the stack and unpack it into curr_dict and prefix.
- Iterate through the key-value pairs of curr_dict.
- If the value is a dictionary, add the current key to the prefix and push the dictionary and new prefix onto the stack.
- If the value is a list, iterate through each element of the list, add the current index and key to the prefix, and push the element and new prefix onto the stack.
- Otherwise, add the prefix and current key as a concatenated string as the key to flattened_dict, with the value being the current value.
- Return flattened_dict.
- Modify the existing code to use the flatten_keys_stack function instead of the flatten_keys function.
- Run the code and verify that the flattened list is the same as the previous methods.
Python3
def flatten_keys_stack(d):
flattened_dict = {}
stack = [(d, '')]
while stack:
curr_dict, prefix = stack.pop()
for k, v in curr_dict.items():
if isinstance(v, dict):
stack.append((v, prefix + k + '.'))
elif isinstance(v, list):
for i, item in enumerate(v):
stack.append((item, prefix + k + '.' + str(i) + '.'))
else:
flattened_dict[prefix + k] = v
return flattened_dict
# input list
test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4},
{'rating': 17, 'price': 8}]},
{'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
flattened_list = []
for d in test_list:
flattened_dict = flatten_keys_stack(d)
flattened_list.append(flattened_dict)
print("Flattened dictionary: " + str(flattened_dict))
print("The flattened list: " + str(flattened_list))
OutputFlattened dictionary: {'Gfg': 1, 'id': 1, 'data.1.rating': 17, 'data.1.price': 8, 'data.0.rating': 7, 'data.0.price': 4}
Flattened dictionary: {'Gfg': 1, 'id': 2, 'data.0.rating': 18, 'data.0.price': 19}
The flattened list: [{'Gfg': 1, 'id': 1, 'data.1.rating': 17, 'data.1.price': 8, 'data.0.rating': 7, 'data.0.price': 4}, {'Gfg': 1, 'id': 2, 'data.0.rating': 18, 'data.0.price': 19}]
The time and auxiliary space complexity of this method are both O(N), where N is the total number of key-value pairs in the input dictionary.
The auxiliary space comes from the use of the stack to traverse the nested dictionaries and lists.
Method 6 : Uses the flatten_dict_pandas() function:
step-by-step approach for the program:
- Import the pandas library with the alias pd.
- Define the flatten_dict_pandas() function that takes a dictionary d as input.
- Inside the function, use pd.json_normalize() to convert the dictionary to a flattened pandas DataFrame, with the keys joined by dots as the column names. The sep parameter specifies the separator to use between the keys.
- Then, use the to_dict() method of the DataFrame to convert it back to a dictionary, with orient='records' to return a list of dictionaries, and take the first dictionary from the list.
- Return the flattened dictionary.
- Define an input list test_list that contains nested dictionaries and lists.
- Create an empty list flattened_list to store the flattened dictionaries.
- Loop through each dictionary d in test_list.
- Call flatten_dict_pandas() on d to flatten it.
- Append the flattened dictionary to flattened_list.
- Print the flattened dictionary.
- After the loop, print the entire flattened list.
Python3
import pandas as pd
def flatten_dict_pandas(d):
df = pd.json_normalize(d, sep='.')
return df.to_dict(orient='records')[0]
# input list
test_list = [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4},
{'rating': 17, 'price': 8}]},
{'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
flattened_list = []
for d in test_list:
flattened_dict = flatten_dict_pandas(d)
flattened_list.append(flattened_dict)
print("Flattened dictionary: " + str(flattened_dict))
print("The flattened list: " + str(flattened_list))
OUTPUT:
Flattened dictionary: {'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}
Flattened dictionary: {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}
The flattened list: [{'Gfg': 1, 'id': 1, 'data': [{'rating': 7, 'price': 4}, {'rating': 17, 'price': 8}]}, {'Gfg': 1, 'id': 2, 'data': [{'rating': 18, 'price': 19}]}]
The time complexity of this method is O(nlogn), where n is the total number of keys and values in the dictionary.
The auxiliary space is also O(nlogn), due to the creation of the pandas DataFrame.
Similar Reads
Python - Flatten Nested Tuples Sometimes, while working with Python Tuples, we can have a problem in which we need to perform flattening of tuples, which can be nested and undesired. This can have application across many domains such as Data Science and web development. Let's discuss certain way in which this task can be performe
7 min read
Flatten a List of Lists in Python Flattening a list of lists means turning a nested list structure into a single flat list. This can be useful when we need to process or analyze the data in a simpler format. In this article, we will explore various approaches to Flatten a list of Lists in Python.Using itertools.chain itertools modul
3 min read
Python - Inversion in nested dictionary Given a nested dictionary, perform inversion of keys, i.e innermost nested becomes outermost and vice-versa. Input : test_dict = {"a" : {"b" : {}}, "d" : {"e" : {}}, "f" : {"g" : {}} Output : {'b': {'a': {}}, 'e': {'d': {}}, 'g': {'f': {}} Explanation : Nested dictionaries inverted as outer dictiona
3 min read
Python - Nested Records List from Lists Sometimes, while working with Python Data, we can have problems in which we have data incoming in different formats. In this, we can receive data as key and value in separate dictionaries and we are required to make values as list of records with a new key. Let's discuss certain ways in which we can
6 min read
Dictionary keys as a list in Python In Python, we will encounter some situations where we need to extract the keys from a dictionary as a list. In this article, we will explore various easy and efficient methods to achieve this.Using list() The simplest and most efficient way to convert dictionary keys to lists is by using a built-in
2 min read
Python | Convert a nested list into a flat list In this article, we will cover how to Flatten a List of Lists in python. To convert a nested list into a flat list we are going to see some examples. Example: Input : l = [1, 2, [3, 4, [5, 6] ], 7, 8, [9, [10] ] ] Output : l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] Input : l = [[['item1', 'item2']], [['ite
5 min read