Create PySpark dataframe from nested dictionary

Last Updated : 17 Jun, 2021
Author: sravankumar_171fa07058

In this article, we discuss how to create a PySpark DataFrame from a nested dictionary using the createDataFrame() method.

The input is a dictionary of dictionaries: each outer key (for example 'student_1') maps to an inner dictionary whose keys become column names. We iterate over the outer dictionary with items() and build one Row per key-value pair:

```python
[Row(**{'': k, **v}) for k, v in data.items()]
```

Here k is the outer key and v is the inner dictionary. The outer key is stored under an empty-string field name, and the select() call in the examples below keeps only the inner fields, so the outer key does not appear in the output.

Example 1: Python program to create student data from a dictionary with nested address fields

```python
# importing SparkSession and Row from the pyspark.sql module
from pyspark.sql import SparkSession
from pyspark.sql import Row

# creating a SparkSession and giving it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# creating the nested dictionary
data = {
    'student_1': {
        'student id': 7058,
        'country': 'India',
        'state': 'AP',
        'district': 'Guntur'
    },
    'student_2': {
        'student id': 7059,
        'country': 'Srilanka',
        'state': 'X',
        'district': 'Y'
    }
}

# building one Row per outer key; the outer key goes into the '' field
rowdata = [Row(**{'': k, **v}) for k, v in data.items()]

# creating the PySpark DataFrame and keeping only the inner fields
final = spark.createDataFrame(rowdata).select(
    'student id', 'country', 'state', 'district')

# displaying the PySpark DataFrame
final.show()
```

Output:

```
+----------+--------+-----+--------+
|student id| country|state|district|
+----------+--------+-----+--------+
|      7058|   India|   AP|  Guntur|
|      7059|Srilanka|    X|       Y|
+----------+--------+-----+--------+
```

Example 2: Python program to create a DataFrame from a nested dictionary with 3 columns (3 keys)

```python
# importing SparkSession and Row from the pyspark.sql module
from pyspark.sql import SparkSession
from pyspark.sql import Row

# creating a SparkSession and giving it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# creating the nested dictionary
data = {
    'student_1': {
        'student id': 7058,
        'country': 'India',
        'state': 'AP'
    },
    'student_2': {
        'student id': 7059,
        'country': 'Srilanka',
        'state': 'X'
    }
}

# building one Row per outer key; the outer key goes into the '' field
rowdata = [Row(**{'': k, **v}) for k, v in data.items()]

# creating the PySpark DataFrame and keeping only the inner fields
final = spark.createDataFrame(rowdata).select(
    'student id', 'country', 'state')

# displaying the PySpark DataFrame
final.show()
```

Output:

```
+----------+--------+-----+
|student id| country|state|
+----------+--------+-----+
|      7058|   India|   AP|
|      7059|Srilanka|    X|
+----------+--------+-----+
```
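The examples above discard the outer key ('student_1', 'student_2') by storing it under an empty field name and leaving it out of select(). If that key is worth keeping, one possible variation is to store it under a real column name. The sketch below is not part of the original examples: the helper names nested_dict_to_records and records_to_dataframe and the column name 'student' are hypothetical, chosen only for illustration. The flattening step is plain Python, so it can be checked without a Spark installation.

```python
# Sample nested dictionary, same shape as in Example 2
data = {
    'student_1': {'student id': 7058, 'country': 'India', 'state': 'AP'},
    'student_2': {'student id': 7059, 'country': 'Srilanka', 'state': 'X'},
}


def nested_dict_to_records(nested, key_name='student'):
    """Flatten {outer_key: {field: value}} into a list of flat dicts.

    The outer key is stored under key_name (a hypothetical column name).
    """
    records = []
    for outer_key, fields in nested.items():
        # Refuse to silently overwrite an inner field with the outer key
        if key_name in fields:
            raise ValueError(f"inner dictionary already has a {key_name!r} field")
        records.append({key_name: outer_key, **fields})
    return records


def records_to_dataframe(spark, records):
    """Build a DataFrame from flat dicts using an existing SparkSession."""
    # Imported here so the flattening helper above works without PySpark
    from pyspark.sql import Row
    return spark.createDataFrame([Row(**record) for record in records])
```

With the SparkSession from the examples, `records_to_dataframe(spark, nested_dict_to_records(data))` should give a DataFrame that also has a `student` column, which can then be passed to select() alongside 'student id', 'country' and 'state'.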