Pandas - Rename Columns After Merge - Data Science Parichay

In this tutorial, we will look at how to rename the columns of a dataframe resulting from the merge of two dataframes.

rename column names after merge in pandas

When you’re merging two dataframes that have columns with the same names, pandas gives them default suffixes like _x, or _y to keep the names of the columns different in the resulting dataframe after the merge.

Let’s look at an example.

import pandas as pd

# Create the first DataFrame
df1 = pd.DataFrame({'Employee': ['John Smith', 'Jane Doe', 'Bob Johnson'],
                    'Job Title': ['Data Scientist', 'Product Manager', 'CEO'],
                    'Years of Experience': [7, 6, 15]})

# Create the second DataFrame
df2 = pd.DataFrame({'Employee': ['John Smith', 'Jane Doe', 'Bob Johnson', 'Alice Williams'],
                    'Job Title': ['Accountant', 'Marketing Manager', 'CFO', 'HR Manager'],
                    'Years of Experience': [2, 1, 10, 6]})

Here, we created two dataframes, df1 and df2. Now, both the dataframes store information about some employees in an office, the only difference is that df1 stores the current information whereas the data in df2 is from five years back.

Let’s merge the two dataframes on the “Employee” column and see what we get by default.

# merge df1 and df2 in a left join
df = df1.merge(df2, on="Employee", how="left")
# display the resulting dataframe
df

Output:

the resulting dataframe after the merge operation

You can see that since the merged dataframe had column names with the same name, the merge() function gave them suffixes to identify which column came from which dataframe. The columns with _x suffix came from the left dataframe and the columns with the _y suffix came from the right dataframe (in the merge).

How to change columns names after the merge?

There are two ways to take this up –

📚 Data Science Programs By Skill Level

Introductory ⭐

Intermediate ⭐⭐⭐

Advanced ⭐⭐⭐⭐⭐

🔎 Find Data Science Programs 👨‍💻 111,889 already enrolled

Disclaimer: Data Science Parichay is reader supported. When you purchase a course through a link on this site, we may earn a small commission at no additional cost to you. Earned commissions help support this website and its team of writers.

Use the pandas dataframe rename() function to change the column names of specific columns in the merged dataframe.
Pass a dictionary of {old_col_name: new_col_name} as an argument to the columns parameter of the rename() function. You can also assign the dataframe new column names by using a list, for example, df.columns = new_col_names_ls.
Alternatively, you can specify the suffix in the merge() function itself to distinguish between the columns using the suffixes parameter. Pass a tuple, for example, (left_suffix, right_suffix) to the suffixes parameter.

Let’s now look at both methods.

Change column names after the merge

# merge df1 and df2 in a left join
df = df1.merge(df2, on="Employee", how="left")
# specify new column names
df.columns = ["Employee", "Job Title_Current", "Years of Experience_Current", "Job Title_Past", "Years of Experience_Past"]
# display the dataframe
df

Output:

column name changed of the merged dataframe

Specify the suffixes in the `merge()` function itself

Let’s specify the suffixes to be used by the common column names from the left and the right dataframes using the suffixes parameter.

# merge df1 and df2 in a left join with custom suffixes
df = df1.merge(df2, on="Employee", how="left", suffixes=("_Current", "_Past"))
# display the dataframe
df

Output:

columns with custom suffixes after the merge operation

You can see that the resulting dataframe has the passed suffixes.

Author

Piyush Raj

Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.

View all posts

How to change columns names after the merge?

Change column names after the merge

Specify the suffixes in the merge() function itself

Author

Specify the suffixes in the `merge()` function itself