In this tutorial, we will look at how to add a column to a pandas dataframe from another dataframe with the help of some examples.
How to add a new column from another dataframe in Pandas?
Let’s say your dataframe df1 has columns “A” and “B”, and another dataframe df2 has columns “C”, “D”, and “E”. You want to add column “C” to df1 (assuming both dataframes have the same length).
To add a column from another pandas dataframe, create a new column in the original dataframe and set it to the values of the column in the other dataframe.
The following is the syntax –
# add column "C" to df1 from df2
df1["C"] = df2["C"]
This will add column “C” to the end of the dataframe df1. If you want to add the new column at a specific column position, use the pandas dataframe insert() function.
Note that here we are simply creating a new column and setting its values to the values from a column in another dataframe. If you want to add a column based on a matching key (that is, perform a join operation), use the pandas merge() or join() function instead.
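As a minimal sketch of the key-based alternative, assume both dataframes share a “Name” key column and list their rows in a different order. The frames `left` and `right` and their values are made up for illustration. The sketch uses merge(), which looks up values by a key column; join() behaves similarly when the key is the index.

```python
import pandas as pd

# hypothetical data: both frames share a "Name" key, rows in different order
left = pd.DataFrame({"Name": ["Jim", "Dwight", "Angela"], "Age": [26, 28, 27]})
right = pd.DataFrame({
    "Name": ["Angela", "Jim", "Dwight"],
    "Department": ["Accounting", "Sales", "Sales"],
})

# a left join keeps every row of `left` and looks up "Department" by "Name"
merged = left.merge(right[["Name", "Department"]], on="Name", how="left")
print(merged)
```

Because the lookup is by key, each employee gets the right department even though the two frames are ordered differently.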
Examples
Let’s now look at some examples of using the above syntax.
First, we will create some dataframes that we will use throughout this tutorial.
import pandas as pd

# employee name and age data
data1 = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
}

# employee department and salary data
data2 = {
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}

# create pandas dataframes
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# display the dataframes
print(df1)
print("--------------")
print(df2)
Output:
     Name  Age
0     Jim   26
1  Dwight   28
2  Angela   27
3    Tobi   32
--------------
   Department  Salary
0       Sales   55000
1       Sales   60000
2  Accounting   52000
3          HR   45000
Here, we create two dataframes – df1 and df2. The dataframe df1 contains the “Name” and “Age” information of some employees in an office and the dataframe df2 contains the “Department” and “Salary” information.
Example 1 – Add column from another dataframe using the assignment operator =
Let’s add the “Department” column to the dataframe df1 from the dataframe df2.
# add "Department" column to df1 from df2
df1["Department"] = df2["Department"]

# display df1
df1
Output:
     Name  Age  Department
0     Jim   26       Sales
1  Dwight   28       Sales
2  Angela   27  Accounting
3    Tobi   32          HR
The dataframe df1 now has the “Department” column. Note that df1 and df2 have the same length and the same default index. Assignment matches values by index label rather than by position, so we’re assuming that the values at the same row indices correspond to the same entity (in this example, the same employee).
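The row-index assumption matters because pandas aligns a Series on its index labels when you assign it as a column. The sketch below uses two made-up frames, `a` and `b`, with the same length but different index order to show the difference. Dropping the index with to_numpy() is one way to copy values by position instead.

```python
import pandas as pd

# hypothetical frames: same length, but b's index labels are in reverse order
a = pd.DataFrame({"Name": ["Jim", "Dwight", "Angela"]}, index=[0, 1, 2])
b = pd.DataFrame({"Department": ["Sales", "Sales", "Accounting"]}, index=[2, 1, 0])

# plain assignment aligns on index labels, so row 0 of `a` gets b's label 0
a["ByLabel"] = b["Department"]

# to_numpy() drops the index, so values are copied in positional order
a["ByPosition"] = b["Department"].to_numpy()

print(a)
```

If the two dataframes have unrelated index labels, label-based assignment fills the unmatched rows with NaN, so it is worth checking the indices (or resetting them) before copying a column across.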
Example 2 – Add column from another dataframe using the insert() function
In the above example, the new column was added at the end of the dataframe. If you want to add the new column at a specific column position, use the pandas dataframe insert() function.
The following is the syntax –
df.insert(loc, column, value, allow_duplicates=False)
We pass the insertion index, column name, and column values to the insert() function. Note that this function modifies the dataframe in place.
Let’s add the “Department” column to the original df1 dataframe at column index 1 (column positions start at 0, so this is the second column).
# reset df1 and df2 to their original values
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# add "Department" column to df1 (from df2) at index 1
df1.insert(1, "Department", df2["Department"])

# display df1
df1
Output:
     Name  Department  Age
0     Jim       Sales   26
1  Dwight       Sales   28
2  Angela  Accounting   27
3    Tobi          HR   32
You can see that the “Department” column from the dataframe df2 is added to the dataframe df1 at index 1.
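Two behaviours of insert() are easy to trip over: it returns None because it works in place, and with the default allow_duplicates=False it raises a ValueError if the column name already exists. The following is a small sketch of both; the names `df`, `dept`, `result`, and `raised` are illustrative only.

```python
import pandas as pd

# small made-up frame and a column of values to insert
df = pd.DataFrame({"Name": ["Jim", "Dwight"], "Age": [26, 28]})
dept = pd.Series(["Sales", "Sales"])

# insert() modifies df in place and returns None, so don't reassign df to it
result = df.insert(1, "Department", dept)
print(result)
print(list(df.columns))

# inserting the same column name again raises ValueError by default
try:
    df.insert(0, "Department", dept)
    raised = False
except ValueError:
    raised = True
print(raised)
```

This is also why the example above recreates df1 from data1 before calling insert(): the “Department” column added in Example 1 would otherwise already be present.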
Summary
In this tutorial, we looked at how to add a new column to a dataframe from another dataframe. The following are the key takeaways –
- The idea is to create a new column and set its values to the values of the column in another dataframe (assuming both the dataframes have the same length).
- To add the new column at a specific column position, use the pandas dataframe insert() function instead.