Pandas Dataframe insert() function (Examples)

When manipulating data in dataframes, it can be handy to know how to insert a column at a specific position in the dataframe. In this tutorial, we will look at how to use the pandas dataframe insert() function with the help of some examples.

How to use the pandas dataframe insert() function?

The pandas dataframe insert() function is used to insert a column at a specific position in a pandas dataframe. The following is the syntax –

df.insert(loc, column, value, allow_duplicates=False)

The pandas dataframe insert() function takes the following arguments –

  • loc – (int) The index where the new column is to be inserted. The index must be in the range 0 <= loc <= len(columns).
  • column – (str, num, or hashable object) The label (column name) for the inserted column.
  • value – (scalar, series, or array-like) The column values.
  • allow_duplicates – (bool) Optional argument. Determines whether you can have duplicate columns or not. It is False by default.
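
To make the accepted types for value concrete, here is a minimal sketch. The "Country", "Floor", and "Bonus" columns and their values are made up for illustration and are not part of the examples in this tutorial. A scalar is repeated for every row, a list must have one element per row, and a Series is matched to rows by its index labels rather than by position.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jim", "Dwight", "Angela"], "Age": [26, 28, 27]})

# scalar value: the same value is used for every row
df.insert(2, "Country", "US")

# list value: one element per row, in row order
df.insert(3, "Floor", [1, 2, 1])

# Series value: aligned on index labels, so row 1 (no label in the Series) gets NaN
bonus = pd.Series([500, 100], index=[2, 0])
df.insert(4, "Bonus", bonus)

# display the dataframe
print(df)
```

Because the Series is aligned on its index, it is worth checking that the index matches the dataframe's index before inserting it.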

Note that the insert() function modifies the dataframe in place.
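
A small sketch of what "in place" means in practice: insert() changes df itself and returns None, so assigning its result back to df would lose the dataframe. The variable name result is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jim", "Dwight"], "Age": [26, 28]})

# insert() modifies df directly; there is nothing useful to capture
result = df.insert(1, "Department", ["Sales", "Sales"])

print(result)            # None
print(list(df.columns))  # ['Name', 'Department', 'Age']
```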

Let’s now look at some examples showing the usage of the above function for different use cases.

Let’s say you have a dataframe containing the name and age data of some employees in an office.

import pandas as pd

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
}

# create pandas dataframe
df = pd.DataFrame(data)

# display the dataframe
df

Output:

employee dataframe with "Name" and "Age" columns

The dataframe df has columns “Name” and “Age”.

Example 1 – Insert the New Column at the end of the dataframe

You want to add a new column containing the employee department information at the end of the above dataframe.

Columns in a pandas dataframe are indexed from 0 to n-1 where n is the number of columns in the dataframe. To insert a column at the end of the dataframe, pass n as the insertion index.

# insert new column at the end of the dataframe
df.insert(len(df.columns), "Department", ["Sales", "Sales", "Accounting", "HR"])

# display the dataframe
df

Output:

resulting dataframe after inserting "Department" column at the end

You can see that the resulting dataframe has the new column inserted at the end. Here, we passed len(df.columns), the number of columns in the dataframe, as the index at which to insert the new column.

Example 2 – Insert the New Column at the beginning of the dataframe

To insert a new column at the beginning of a dataframe using the insert() function, pass 0 as the loc argument (the index to insert the new column).

Let’s insert the “Department” column at the beginning of our original dataframe.

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
}

# create pandas dataframe
df = pd.DataFrame(data)

# insert column at the beginning of the dataframe
df.insert(0, "Department", ["Sales", "Sales", "Accounting", "HR"])

# display the dataframe
df

Output:

resulting dataframe after inserting "Department" column at the beginning

The new column, “Department”, is added at the beginning of the dataframe.

Example 3 – Insert the New Column at a specific position in the dataframe

This is the general case where we specify the column index at which we want to insert the new column. The above two examples were special cases of this general case.

Let’s insert the “Department” column at index 1 (that is, as the second column) in the original dataframe.

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
}

# create pandas dataframe
df = pd.DataFrame(data)

# insert column at index 1
df.insert(1, "Department", ["Sales", "Sales", "Accounting", "HR"])

# display the dataframe
df

Output:

"Department" column inserted at index 1

Here, the new “Department” column is inserted at index 1.
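
In practice you may know the name of a neighbouring column rather than its numeric index. One possible approach, sketched below, is to look up the position with df.columns.get_loc() and pass that to insert(). This assumes the column labels are unique; with duplicate labels get_loc() may not return a plain integer.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32]
})

# position of the existing "Name" column (0 in this dataframe)
name_pos = df.columns.get_loc("Name")

# insert the new column immediately after "Name"
df.insert(name_pos + 1, "Department", ["Sales", "Sales", "Accounting", "HR"])

print(list(df.columns))  # ['Name', 'Department', 'Age']
```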

Example 4 – Insert Duplicate Column in the dataframe

If you try to insert a duplicate column (a column whose label already exists in the dataframe), the insert() function will raise a ValueError by default.

To insert a duplicate column using the pandas dataframe insert() function, pass allow_duplicates=True.

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"]
}

# create pandas dataframe
df = pd.DataFrame(data)

# insert duplicate column
df.insert(len(df.columns), "Department", ["Sales", "Sales", "Accounting", "HR"], allow_duplicates=True)

# display the dataframe
df

Output:

resulting dataframe after inserting a duplicate "Department" column at the end

Now, the duplicate column is inserted into the dataframe.
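
For comparison, here is a short sketch of both behaviours on a smaller, made-up dataframe: the default call raises a ValueError, and after inserting with allow_duplicates=True, selecting the repeated label returns a dataframe with both columns instead of a single series.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jim", "Dwight"], "Department": ["Sales", "Sales"]})

# default behaviour: inserting an existing label raises a ValueError
try:
    df.insert(len(df.columns), "Department", ["HR", "HR"])
    error = None
except ValueError as exc:
    error = exc
print(error)

# with allow_duplicates=True the second "Department" column is accepted
df.insert(len(df.columns), "Department", ["HR", "HR"], allow_duplicates=True)

# selecting a duplicated label returns every column with that label
both = df["Department"]
print(type(both).__name__)  # DataFrame
print(both.shape)           # (2, 2)
```

Since duplicate labels make column selection ambiguous, it is usually safer to give the new column a distinct name unless you specifically need the duplicate.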

Summary – Insert Column with Pandas dataframe insert() function

In this tutorial, we looked at the usage of the pandas dataframe insert() function. The following are the key takeaways from this tutorial –

  • The pandas dataframe insert() function is used to insert a column at a specific position in the dataframe.
  • The insert() function modifies the dataframe in place.
  • By default, the insert() function does not allow you to insert duplicate columns in the dataframe. To insert a duplicate column, pass allow_duplicates=True.

Note that other methods can also add columns to a pandas dataframe, but they generally add the new column at the end of the dataframe, whereas the insert() function lets you specify the index at which to insert the new column.
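
As a rough illustration of that difference, the sketch below compares two common alternatives, bracket assignment and assign(), with insert(). The "Floor" and "EmployeeId" columns are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jim", "Dwight"], "Age": [26, 28]})

# bracket assignment adds the new column at the end, in place
df["Department"] = ["Sales", "Sales"]

# assign() also adds at the end, but returns a new dataframe and leaves df unchanged
df2 = df.assign(Floor=[1, 2])

# insert() places the new column at the index you choose, in place
df.insert(0, "EmployeeId", [101, 102])

print(list(df.columns))   # ['EmployeeId', 'Name', 'Age', 'Department']
print(list(df2.columns))  # ['Name', 'Age', 'Department', 'Floor']
```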

Author

  • Piyush

    Piyush is a data scientist passionate about using data to understand things better and make informed decisions. In the past, he's worked as a Data Scientist for ZS and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.