How to Rename Multiple PySpark DataFrame Columns

Last Updated : 29 Jun, 2021

In this article, we will discuss how to rename multiple columns in a PySpark DataFrame. For this we will use the withColumnRenamed() and toDF() functions.

Creating a DataFrame for demonstration:

Python3

# importing module
import pyspark

# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession

# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# list of students data with null values
# we can define null values with None
data = [[None, "sravan", "vignan"],
        ["2", None, "vvit"],
        ["3", "rohith", None],
        ["4", "sridevi", "vignan"],
        ["1", None, None],
        ["5", "gnanesh", "iit"]]

# specify column names
columns = ['ID', 'NAME', 'college']

# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)

# show columns
print(dataframe.columns)

# display dataframe
dataframe.show()

Output:

Method 1: Using withColumnRenamed()

This method is used to rename a column in the dataframe.

Syntax: dataframe.withColumnRenamed("old_column_name", "new_column_name")

where:
- dataframe is the pyspark dataframe
- old_column_name is the existing column name
- new_column_name is the new column name

To rename multiple columns, we can chain withColumnRenamed() n times, joining the calls with the "." (dot) operator.

Syntax: dataframe.withColumnRenamed("old_column_name", "new_column_name").withColumnRenamed("old_column_name", "new_column_name")

Example 1: Python program to change the column name for two columns

Python3

# display actual columns
print("Actual columns: ", dataframe.columns)

# change the college column name to university
# and ID to student_id
dataframe = dataframe.withColumnRenamed(
    "college", "university").withColumnRenamed("ID", "student_id")

# display modified columns
print("Modified columns: ", dataframe.columns)

# final dataframe
dataframe.show()

Output:

Example 2: Rename all columns (starting again from the original dataframe created above)

Python3

# display actual columns
print("Actual columns: ", dataframe.columns)

# change the college column name to university,
# ID to student_id and NAME to student_name
dataframe = dataframe.withColumnRenamed(
    "college", "university").withColumnRenamed(
    "ID", "student_id").withColumnRenamed("NAME", "student_name")

# display modified columns
print("Modified columns: ", dataframe.columns)

# final dataframe
dataframe.show()

Output:

Method 2: Using toDF()

This method is used to change the names of all the columns of the dataframe at once.

Syntax: dataframe.toDF(*("column 1", "column 2", "column n"))

where the arguments are the new names for all the columns of the dataframe, given in the same order as the existing columns.

Example: Python program to change the column names

Python3

# display actual columns
print("Actual columns: ", dataframe.columns)

# change column names to A, B, C
dataframe = dataframe.toDF(*("A", "B", "C"))

# display new columns
print("New columns: ", dataframe.columns)

# display dataframe
dataframe.show()

Output:
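As a short, hedged sketch of how the two approaches compare, the snippet below renames several columns driven by a mapping of old names to new names, first by chaining withColumnRenamed() in a loop and then by building the full name list for toDF(). It is not taken from the examples above: the rename_map dictionary, the two-row sample data and the variable names renamed and relabelled are illustrative assumptions only.

Python3

# a minimal, illustrative sketch (not part of the original examples):
# rebuild a small dataframe like the one above, then rename several
# columns at once with each of the two methods from this article
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('sparkdf').getOrCreate()
dataframe = spark.createDataFrame(
    [["1", "sravan", "vignan"], ["2", "rohith", "vvit"]],
    ['ID', 'NAME', 'college'])

# mapping of existing column names to new names (illustrative names only)
rename_map = {"ID": "student_id", "NAME": "student_name", "college": "university"}

# approach 1: chain withColumnRenamed() once per entry in the mapping
renamed = dataframe
for old_name, new_name in rename_map.items():
    renamed = renamed.withColumnRenamed(old_name, new_name)
print(renamed.columns)      # ['student_id', 'student_name', 'university']

# approach 2: build the complete list of new names in column order
# and pass it to toDF(), which renames every column at once
relabelled = dataframe.toDF(*[rename_map.get(c, c) for c in dataframe.columns])
print(relabelled.columns)   # ['student_id', 'student_name', 'university']

The loop over the dictionary keeps each rename explicit and leaves unmapped columns untouched, while the toDF() route requires supplying a new name for every column in its existing order.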