rename pyspark dataframe column name

Rename DataFrame Column Name in PySpark

In this tutorial, we will look at how to rename a column in a PySpark dataframe, with examples for renaming a single column and for renaming several columns at once.

How to rename a PySpark dataframe column?

You can use the PySpark withColumnRenamed() method to rename a column in a PySpark dataframe. It takes the existing column name and the new column name as arguments. The syntax is as follows.

DataFrame.withColumnRenamed(old_column_name, new_column_name)

It returns a new PySpark dataframe with the column renamed. The dataframe it is called on is not modified.

Examples

Let’s look at some examples of using the above method to rename one or more columns. First, we’ll create a PySpark dataframe that we will be using throughout this tutorial.

# import the SparkSession class from pyspark.sql
from pyspark.sql import SparkSession

# create (or reuse) a Spark session for this app
spark = SparkSession.builder.appName('datascience_parichay').getOrCreate()

# books data as a list of lists
books = [[1, "PHP", "Sravan", 250],
         [2, "SQL", "Chandra", 300],
         [3, "Python", "Harsha", 250],
         [4, "R", "Rohith", 1200],
         [5, "Hadoop", "Manasa", 700]]

# create the dataframe from the books data
dataframe = spark.createDataFrame(books, ['Book_Id', 'Book_Name', 'Author', 'Price'])

# display the dataframe
dataframe.show()

Output:

+-------+---------+-------+-----+
|Book_Id|Book_Name| Author|Price|
+-------+---------+-------+-----+
|      1|      PHP| Sravan|  250|
|      2|      SQL|Chandra|  300|
|      3|   Python| Harsha|  250|
|      4|        R| Rohith| 1200|
|      5|   Hadoop| Manasa|  700|
+-------+---------+-------+-----+

We have a dataframe with 5 rows and 4 columns containing information on some books: the book id, book name, author, and price.

Rename a column in PySpark

Let’s rename the column “Author” to the name “Writer”. For this, we pass the old column name, “Author” and the new column name, “Writer” as arguments to the withColumnRenamed() function.

# change column name from Author to Writer 
dataframe.withColumnRenamed("Author", "Writer").show()

Output:

+-------+---------+-------+-----+
|Book_Id|Book_Name| Writer|Price|
+-------+---------+-------+-----+
|      1|      PHP| Sravan|  250|
|      2|      SQL|Chandra|  300|
|      3|   Python| Harsha|  250|
|      4|        R| Rohith| 1200|
|      5|   Hadoop| Manasa|  700|
+-------+---------+-------+-----+

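You can see that the “Author” column was renamed to “Writer”. Since the result was only displayed and not assigned to a variable, the original dataframe still has the “Author” column, which is why it appears under that name in the next example.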
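One caveat when renaming: if the old column name is not present in the dataframe, withColumnRenamed() does nothing and raises no error, so a misspelled name can go unnoticed. The sketch below is one possible guard against that. The helper name rename_strict is hypothetical and not part of PySpark; it only relies on the dataframe's columns attribute and on withColumnRenamed().

```python
def rename_strict(df, old_name, new_name):
    """Rename a column, but fail loudly if old_name is missing.

    rename_strict is a hypothetical helper for illustration,
    not a PySpark function.
    """
    # df.columns is the list of current column names
    if old_name not in df.columns:
        raise ValueError(
            f"Column {old_name!r} not found; available columns: {df.columns}"
        )
    # withColumnRenamed returns a new dataframe; df itself is unchanged
    return df.withColumnRenamed(old_name, new_name)
```

With the dataframe from this tutorial, rename_strict(dataframe, "Author", "Writer") should behave like the call above, while a misspelled name such as "Auther" would raise a ValueError instead of silently returning the dataframe unchanged.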

Rename multiple columns in PySpark

To rename multiple columns, you can chain multiple calls to the withColumnRenamed() function. For example, let’s rename the “Book_Id” column to “Id” and the “Book_Name” column to “Name”.

# change column names - Book_Id to Id and Book_Name to Name
dataframe.withColumnRenamed("Book_Id", "Id").withColumnRenamed("Book_Name", "Name").show()

Output:

+---+------+-------+-----+
| Id|  Name| Author|Price|
+---+------+-------+-----+
|  1|   PHP| Sravan|  250|
|  2|   SQL|Chandra|  300|
|  3|Python| Harsha|  250|
|  4|     R| Rohith| 1200|
|  5|Hadoop| Manasa|  700|
+---+------+-------+-----+

Both columns were renamed, and the “Author” and “Price” columns were left as they were.
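When there are many columns to rename, writing out a long chain of calls gets tedious. As a rough sketch, the renames can instead be driven by a dictionary of old-to-new names. The two helper names below are hypothetical (they are not PySpark functions); the first repeats withColumnRenamed() for each pair, and the second builds the full list of new names and passes it to the dataframe's toDF() method.

```python
from functools import reduce


def rename_columns(df, mapping):
    """Apply withColumnRenamed() once per (old, new) pair in mapping."""
    return reduce(
        lambda acc, pair: acc.withColumnRenamed(pair[0], pair[1]),
        mapping.items(),
        df,  # start from the original dataframe
    )


def rename_columns_todf(df, mapping):
    """Rename in one step by passing every column name to toDF()."""
    # keep a column's current name when it is not listed in mapping
    new_names = [mapping.get(name, name) for name in df.columns]
    return df.toDF(*new_names)
```

With the dataframe from this tutorial, rename_columns(dataframe, {"Book_Id": "Id", "Book_Name": "Name"}).show() should print the same table as the chained calls above, and rename_columns_todf() should give the same column names.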

In this tutorial, we looked at how to use the withColumnRenamed() method to change one or more column names in a PySpark dataframe.

Authors

  • Piyush Raj

    Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.

  • Gottumukkala Sravan Kumar