When manipulating data in dataframes, it is handy to know how to insert a column at a specific position. In this tutorial, we will look at how to use the pandas dataframe insert() function with the help of some examples.
How to use the pandas dataframe insert() function?
The pandas dataframe insert() function is used to insert a column at a specific position in a pandas dataframe. The following is the syntax –
df.insert(loc, column, value, allow_duplicates=False)
The pandas dataframe insert() function takes the following arguments –
- loc – (int) The index where the new column is to be inserted. The index must be in the range 0 <= loc <= len(columns).
- column – (str, number, or hashable object) The label (column name) for the inserted column.
- value – (scalar, Series, or array-like) The column values.
- allow_duplicates – (bool) Optional argument. Determines whether you can insert a column whose label already exists in the dataframe. It is False by default.
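As a quick illustration of the three kinds of value listed above, the following sketch inserts a scalar, a list, and a Series. The small dataframe and the column names ("Country", "Floor", "Badge") are made up for this illustration and are not part of the employee example used in the rest of the tutorial.

```python
import pandas as pd

# hypothetical two-row dataframe, used only for illustration
df = pd.DataFrame({"Name": ["Jim", "Dwight"], "Age": [26, 28]})

# scalar: the same value is repeated for every row
df.insert(1, "Country", "USA")

# list: its length must match the number of rows
df.insert(0, "Floor", [2, 3])

# Series: values are aligned on the dataframe's index, not on position
badges = pd.Series(["B-28", "B-26"], index=[1, 0])
df.insert(len(df.columns), "Badge", badges)

print(df)
```

Because a Series is aligned on the index, row 0 receives "B-26" and row 1 receives "B-28" here, even though the Series lists them in the opposite order.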
Note that the insert() function modifies the dataframe in place and returns None, so there is no need to assign its result back to a variable.
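A minimal sketch of what "in place" means in practice is shown below. The variable names (result, original, modified) are illustrative only, and copying first is just one possible way to keep an unmodified version around.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jim", "Dwight"], "Age": [26, 28]})

# insert() changes df itself; the call does not return a new dataframe
result = df.insert(1, "Department", ["Sales", "Sales"])

print(result)            # None
print(list(df.columns))  # ['Name', 'Department', 'Age']

# to keep the original untouched, insert into a copy instead
original = pd.DataFrame({"Name": ["Jim", "Dwight"], "Age": [26, 28]})
modified = original.copy()
modified.insert(0, "Department", ["Sales", "Sales"])

print(list(original.columns))  # ['Name', 'Age']
```

A common mistake is writing df = df.insert(...), which replaces the dataframe with None.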
Let’s now look at some examples showing the usage of the above function for different use cases.
Let’s say you have a dataframe containing the name and age data of some employees in an office.
import pandas as pd

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
}
# create pandas dataframe
df = pd.DataFrame(data)
# display the dataframe
df
The dataframe df has two columns, “Name” and “Age”, in that order.
Example 1 – Insert the New Column at the end of the dataframe
You want to add a new column containing the employee department information at the end of the above dataframe.
Columns in a pandas dataframe are indexed from 0 to n-1 where n is the number of columns in the dataframe. To insert a column at the end of the dataframe, pass n as the insertion index.
# insert new column at the end of the dataframe
df.insert(len(df.columns), "Department", ["Sales", "Sales", "Accounting", "HR"])
# display the dataframe
df
The resulting dataframe has the columns “Name”, “Age”, and “Department”, with the new column at the end. Here, we passed len(df.columns) as the index at which to insert the new column.
Example 2 – Insert the New Column at the beginning of the dataframe
To insert a new column at the beginning of a dataframe using the insert() function, pass 0 as the loc argument (the index to insert the new column).
Let’s insert the “Department” column at the beginning of our original dataframe.
# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
}
# create pandas dataframe
df = pd.DataFrame(data)
# insert column at the beginning of the dataframe
df.insert(0, "Department", ["Sales", "Sales", "Accounting", "HR"])
# display the dataframe
df
The new column, “Department”, is added at the beginning of the dataframe, ahead of “Name” and “Age”.
Example 3 – Insert the New Column at a specific position in the dataframe
This is the general case where we specify the column index at which we want to insert the new column. The above two examples were special cases of this general case.
Let’s insert the “Department” column at index 1 (that is, as the second column) in the original dataframe.
# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
}
# create pandas dataframe
df = pd.DataFrame(data)
# insert column at index 1
df.insert(1, "Department", ["Sales", "Sales", "Accounting", "HR"])
# display the dataframe
df
Here, the new “Department” column is inserted at index 1, between “Name” and “Age”.
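In practice you often know which column the new one should follow, not its numeric index. One possible approach, sketched below, is to look up the position of an existing column with df.columns.get_loc() and add 1. This assumes the column labels are unique, in which case get_loc() returns a plain integer. The variable name name_pos is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
})

# position of the existing "Name" column (an int when the label is unique)
name_pos = df.columns.get_loc("Name")

# insert the new column right after "Name"
df.insert(name_pos + 1, "Department", ["Sales", "Sales", "Accounting", "HR"])

print(list(df.columns))  # ['Name', 'Department', 'Age']
```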
Example 4 – Insert Duplicate Column in the dataframe
If you try to insert a duplicate column (a column whose label already exists in the dataframe), the insert() function will raise a ValueError by default.
To insert a duplicate column using the pandas dataframe insert() function, pass allow_duplicates=True.
# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"]
}
# create pandas dataframe
df = pd.DataFrame(data)
# insert duplicate column
df.insert(len(df.columns), "Department", ["Sales", "Sales", "Accounting", "HR"], allow_duplicates=True)
# display the dataframe
df
Now the duplicate column is inserted, and the dataframe has two columns labeled “Department”.
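For comparison, here is a small sketch of the default behavior, where the duplicate insert is rejected, followed by the same call with allow_duplicates=True. The two-row dataframe and the names error_raised and dupes are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jim", "Dwight"],
    "Department": ["Sales", "Sales"]
})

try:
    # allow_duplicates is left at its default here
    df.insert(len(df.columns), "Department", ["Sales", "Sales"])
    error_raised = False
except ValueError as err:
    error_raised = True
    print("ValueError:", err)

# the failed call leaves the dataframe unchanged
columns_after_failure = list(df.columns)

# the same insert succeeds once duplicates are explicitly allowed
df.insert(len(df.columns), "Department", ["Sales", "Sales"], allow_duplicates=True)

# selecting a duplicated label returns a dataframe, not a single Series
dupes = df["Department"]
print(type(dupes))
```

Keep in mind that once a label is duplicated, df["Department"] returns every column with that label, which can surprise code that expects a single Series.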
Summary – Insert Column with Pandas dataframe insert() function
In this tutorial, we looked at the usage of the pandas dataframe insert() function. The following are the key takeaways from this tutorial –
- The pandas dataframe insert() function is used to insert a column at a specific position in the dataframe.
- The insert() function modifies the dataframe in place.
- By default, the insert() function does not allow you to insert duplicate columns in the dataframe. To insert a duplicate column, pass allow_duplicates=True.
Note that there are other methods as well that allow you to add columns to a pandas dataframe, but they generally add the new column at the end of the dataframe, whereas the insert() function lets you specify the index at which to insert the new column.
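To make the contrast concrete, the sketch below uses two such alternatives, bracket assignment and assign(). The tutorial does not name specific methods, so these two are chosen here only as common examples. The dataframe and the "Country" and "Id" columns are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jim", "Dwight"], "Age": [26, 28]})

# bracket assignment appends a new column at the end, in place
df["Department"] = ["Sales", "Sales"]

# assign() returns a new dataframe with the column appended at the end
df2 = df.assign(Country="USA")

# insert() takes an explicit position and modifies df in place
df.insert(0, "Id", [1, 2])

print(list(df.columns))   # ['Id', 'Name', 'Age', 'Department']
print(list(df2.columns))  # ['Name', 'Age', 'Department', 'Country']
```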