Python programs often receive data from sources in formats such as CSV or JSON, which are typically parsed into Python lists or dictionaries. To run calculations or analysis with a package like pandas, we need to convert this data into a DataFrame. In this article we will see how to convert a Python list whose elements are nested dictionaries into a pandas DataFrame.
We first loop over the list of nested dictionaries and extract the rows of data from each one. An inner for loop then appends each row, together with the name taken from the outer dictionary, to a new list that starts out empty. Finally, we pass that list to the DataFrame constructor in the pandas library to create the DataFrame.
Example
import pandas as pd

# Given list of nested dictionaries
# (named data_list so it does not shadow the built-in list type)
data_list = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange"
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"}
        ],
        "Name": "Grapes"
    }
]

rows = []

# Getting rows
for data in data_list:
    data_row = data['Fruit']
    n = data['Name']
    for row in data_row:
        # Copy the row and add the name, leaving the input data unchanged
        rows.append({**row, 'Name': n})

# Convert to data frame
df = pd.DataFrame(rows)
print(df)
Running the above code gives us the following result −
Output
   Price Quality    Name
0   15.2       A  Orange
1   19.0       B  Orange
2   17.8       C  Orange
3   23.2       A  Grapes
4   28.0       B  Grapes
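As an alternative to the explicit loops, pandas also provides the json_normalize function, which can flatten this kind of structure in one call. The sketch below is not the method used in this article; it assumes a pandas version in which json_normalize is available at the top level (1.0 or later), and the variable name data_list is chosen here for illustration.

```python
import pandas as pd

# Same sample data as in the example above
data_list = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]

# record_path names the key that holds the nested rows;
# meta lists the outer keys to copy onto every extracted row
df = pd.json_normalize(data_list, record_path="Fruit", meta=["Name"])
print(df)
```

This should produce the same five rows and the same Price, Quality and Name columns as the loop-based version, without modifying the input list.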
Applying pivot
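We can also apply the pivot_table function to reorganize the data the way we want. In the example below, each Name becomes a row and each Quality becomes a column holding the corresponding Price. A combination that does not occur in the data, such as quality C for Grapes, is shown as NaN.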
Example
import pandas as pd

# List of nested dictionaries
# (named data_list so it does not shadow the built-in list type)
data_list = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange"
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"}
        ],
        "Name": "Grapes"
    }
]

rows = []

# Appending rows
for data in data_list:
    data_row = data['Fruit']
    n = data['Name']
    for row in data_row:
        # Copy the row and add the name, leaving the input data unchanged
        rows.append({**row, 'Name': n})

# Build the data frame, then pivot: one row per Name, one column per Quality
df = pd.DataFrame(rows)
df = df.pivot_table(index='Name', columns=['Quality'], values=['Price']).reset_index()
print(df)
Running the above code gives us the following result −
Output
           Name Price
Quality             A     B     C
0        Grapes  23.2  28.0   NaN
1        Orange  15.2  19.0  17.8
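The output above has two header levels because values was passed as a list. If single-level column names are preferred, one option is to pass values as a plain string instead. The following is a minimal sketch under that assumption; the rows list is written out by hand here to keep the snippet self-contained, and the name flat is chosen for illustration.

```python
import pandas as pd

# Flattened rows, equivalent to those built in the example above
rows = [
    {"Price": 15.2, "Quality": "A", "Name": "Orange"},
    {"Price": 19, "Quality": "B", "Name": "Orange"},
    {"Price": 17.8, "Quality": "C", "Name": "Orange"},
    {"Price": 23.2, "Quality": "A", "Name": "Grapes"},
    {"Price": 28, "Quality": "B", "Name": "Grapes"},
]

df = pd.DataFrame(rows)

# Passing values as a string (not a list) yields a single level of columns
flat = df.pivot_table(index="Name", columns="Quality", values="Price").reset_index()

# Drop the leftover "Quality" label on the column axis
flat.columns.name = None
print(flat)
```

Note that pivot_table aggregates with the mean by default, so if the data contained several prices for the same Name and Quality, the cell would hold their average.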