PySpark - Create DataFrame from List

Last Updated: 30 May, 2021 · Author: sravankumar_171fa07058

In this article, we discuss how to create a PySpark DataFrame from a list. First, build the data as a list of rows (each row itself a list of values) and a separate list of column names. Then pass both to the spark.createDataFrame() method, which builds the DataFrame: the first argument is the row data and the second is the list of column names.

dataframe = spark.createDataFrame(data, columns)

Example 1: Python code to create a PySpark student DataFrame from two row lists.

```python
# importing the pyspark module
import pyspark

# importing SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# creating a SparkSession and giving it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# college data as a list of two rows
data = [["java", "dbms", "python"],
        ["OOPS", "SQL", "Machine Learning"]]

# column names of the DataFrame
columns = ["Subject 1", "Subject 2", "Subject 3"]

# creating the DataFrame
dataframe = spark.createDataFrame(data, columns)

# show the DataFrame
dataframe.show()
```

Output:

Example 2: Create a DataFrame from four row lists.

```python
# importing the pyspark module
import pyspark

# importing SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# creating a SparkSession and giving it an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# college data as a list of four rows
data = [["node.js", "dbms", "integration"],
        ["jsp", "SQL", "trigonometry"],
        ["php", "oracle", "statistics"],
        [".net", "db2", "Machine Learning"]]

# column names of the DataFrame
columns = ["Web Technologies", "Data bases", "Maths"]

# creating the DataFrame
dataframe = spark.createDataFrame(data, columns)

# show the DataFrame
dataframe.show()
```

Output:
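The examples above start from row-shaped data, but you often have one list per column instead. In that case, zip() can pair the per-column lists into rows before calling spark.createDataFrame(). A minimal sketch of that zip step, using hypothetical per-column lists of student names and marks:

```python
# Hypothetical per-column lists: one list of names, one of marks.
names = ["Alice", "Bob", "Cara"]
marks = [85, 72, 90]

# zip() pairs the i-th element of each list into one row tuple.
rows = list(zip(names, marks))
print(rows)  # [('Alice', 85), ('Bob', 72), ('Cara', 90)]

# These row tuples, plus a list of column names, can then be passed
# to spark.createDataFrame(rows, ["name", "marks"]) exactly as in
# the examples above.
```

This works for any number of per-column lists: zip(list1, list2, ..., listn) yields one n-element row per position.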