How to Parse Nested JSON in Python
Given a nested JSON object, the task is to parse it in Python. This article covers several ways to do that: the built-in json module, a recursive traversal, and the pandas library.
What is Nested JSON
Nested JSON refers to a JSON object that contains another JSON object (or an array of objects) inside it.
Example:
{
"name": "John",
"age": 30,
"address": {
"city": "New York",
"zipcode": "10001"
}
}
In the above example, address is a nested JSON object.
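As a quick illustration of how this nesting carries over into Python, the sketch below parses the example above with one extra field. The "phones" array and its values are assumptions added for illustration; they are not part of the original example.

```python
import json

# The "phones" array is a hypothetical addition to the example above.
text = '''
{
  "name": "John",
  "age": 30,
  "address": {"city": "New York", "zipcode": "10001"},
  "phones": [{"type": "home", "number": "555-0100"},
             {"type": "work", "number": "555-0101"}]
}
'''

data = json.loads(text)

# JSON objects become Python dicts, JSON arrays become Python lists.
print(type(data["address"]).__name__)  # dict
print(type(data["phones"]).__name__)   # list

# Dicts are indexed by key, lists by position.
print(data["phones"][1]["number"])     # 555-0101
```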
The sections below cover these approaches one by one.
Using the JSON module
In this example, we use the json module to parse a nested JSON string. We then access specific values within the parsed structure using dictionary keys to retrieve the name, age, city and zipcode.
import json
n_json = '{"name": "Prajjwal", "age": 23, "address": {"city": "Prayagraj", "zipcode": "20210"}}'
data = json.loads(n_json)
name = data['name']
age = data['age']
city = data['address']['city']
zipc = data['address']['zipcode']
print(f"Name: {name}")
print(f"Age: {age}")
print(f"City: {city}")
print(f"Zipcode: {zipc}")
Output
Name: Prajjwal
Age: 23
City: Prayagraj
Zipcode: 20210
Explanation:
- json.loads() converts the JSON string into a Python dictionary.
- Nested values (like "city") are accessed using key chaining: data['address']['city'].
- This approach is ideal when the structure of JSON is fixed and known in advance.
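Key chaining raises a KeyError as soon as one key is missing, which can happen when the structure is only mostly fixed. The following is a minimal sketch of one way to guard against that; the helper name get_nested and its default parameter are hypothetical and not part of the json module.

```python
import json

# Same data as above, but without the "zipcode" key.
n_json = '{"name": "Prajjwal", "age": 23, "address": {"city": "Prayagraj"}}'
data = json.loads(n_json)

def get_nested(obj, *keys, default=None):
    """Follow keys (dict keys or list indexes); return default if a step is missing.

    Hypothetical helper written for this example.
    """
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            # KeyError: missing dict key, IndexError: list index out of range,
            # TypeError: tried to index into a value that is not a dict or list.
            return default
    return obj

city = get_nested(data, "address", "city")
zipc = get_nested(data, "address", "zipcode", default="unknown")
print(f"City: {city}")     # City: Prayagraj
print(f"Zipcode: {zipc}")  # Zipcode: unknown
```

For a single level, the built-in dict.get() method does the same job, for example data.get("name").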
Using Recursion
In this example, the parse_json function uses recursion to traverse the nested JSON structure and build a new dictionary with the same nesting. The parsed data is then accessed using keys to retrieve specific values such as name, age, city and zipcode.
import json
n_json = '{"person": {"name": "Prajjwal", "age": 23, "address": {"city": "Prayagraj", "zipcode": "20210"}}}'
# Define a recursive function to parse nested JSON
def parse_json(data):
    result = {}
    for key, val in data.items():
        if isinstance(val, dict):
            result[key] = parse_json(val)  # Recursive call
        else:
            result[key] = val
    return result
data = parse_json(json.loads(n_json))
name = data['person']['name']
age = data['person']['age']
city = data['person']['address']['city']
zipc = data['person']['address']['zipcode']
print(f"Name: {name}")
print(f"Age: {age}")
print(f"City: {city}")
print(f"Zipcode: {zipc}")
Output
Name: Prajjwal
Age: 23
City: Prayagraj
Zipcode: 20210
Explanation:
- The parse_json() function recursively visits each level of the JSON structure.
- This method is helpful when we don't know in advance how deep the nesting is.
- It builds a new dictionary with the same nested structure as the input, so values are accessed with the same key chaining as before. Lists are copied as-is; the function only descends into dictionaries.
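The parse_json() function above only descends into dictionaries and returns a copy with the same shape. As a minimal sketch of a variant that also walks lists and produces a single-level dictionary, the function below joins the keys along each path with dots. The name flatten_json, the dotted-path convention and the "tags" field are assumptions made for this illustration.

```python
import json

def flatten_json(value, prefix=""):
    """Return a flat dict mapping dotted paths to leaf values.

    Hypothetical helper: list items use their index as the path segment.
    Empty dicts and empty lists contribute no entries.
    """
    flat = {}
    if isinstance(value, dict):
        for key, val in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten_json(val, path))  # Recurse into nested object
    elif isinstance(value, list):
        for index, val in enumerate(value):
            path = f"{prefix}.{index}" if prefix else str(index)
            flat.update(flatten_json(val, path))  # Recurse into array item
    else:
        flat[prefix] = value  # Leaf: string, number, bool or None
    return flat

n_json = '{"person": {"name": "Prajjwal", "tags": ["a", "b"], "address": {"city": "Prayagraj"}}}'
flat = flatten_json(json.loads(n_json))

print(flat["person.name"])          # Prajjwal
print(flat["person.tags.1"])        # b
print(flat["person.address.city"])  # Prayagraj
```

A flat dictionary like this is convenient for searching or logging, at the cost of losing the original shape.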
Using the Pandas library
In this example, the pd.json_normalize function from the pandas library flattens the list of records in the JSON data into a pandas DataFrame. The resulting DataFrame, df, allows easy access to specific columns such as 'name' and 'age'.
import pandas as pd
import json
n_json = '{"employees": [{"name": "Prajjwal", "age": 23}, {"name": "Kareena", "age": 22}]}'
data = json.loads(n_json)
df = pd.json_normalize(data, 'employees')
names = df['name']
ages = df['age']
print("Names:", list(names))
print("Ages:", list(ages))
Output
Names: ['Prajjwal', 'Kareena']
Ages: [23, 22]
Explanation:
- json.loads() first converts the JSON string to a Python object.
- pd.json_normalize() flattens the list under the "employees" key into a table-like structure, with one row per record.
- This is particularly useful for structured data like logs, records or API results with multiple entries.
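When each record itself contains a nested object, pd.json_normalize() expands it into columns whose names join the keys with a dot. The sketch below assumes pandas is installed; the "address" fields and their values are hypothetical additions to the employee records used above.

```python
import json
import pandas as pd

# The "address" objects are hypothetical additions for illustration.
n_json = '''
{"employees": [
  {"name": "Prajjwal", "age": 23, "address": {"city": "Prayagraj", "zipcode": "20210"}},
  {"name": "Kareena", "age": 22, "address": {"city": "Mumbai", "zipcode": "400001"}}
]}
'''
data = json.loads(n_json)

# record_path selects the list to turn into rows; nested dicts inside
# each record become dotted column names such as "address.city".
df = pd.json_normalize(data, record_path="employees")

print(sorted(df.columns))
print("Cities:", list(df["address.city"]))  # Cities: ['Prayagraj', 'Mumbai']
```

The separator can be changed with the sep parameter, for example sep="_" to get address_city.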