
Pandas – Set Column as Index (With Examples)

In this tutorial, we will look at how to set a column in a pandas dataframe as the index of the dataframe with the help of some examples.

How to set a column as the index in a pandas dataframe?

You can use the pandas dataframe set_index() method to set a column as the index of a dataframe. Pass the column name as the argument.

The following is the syntax –

# set a column as dataframe's index
df.set_index("Col")

# set combination of columns as dataframe's index
df.set_index(["Col1", "Col2"])

By default, set_index() returns a new dataframe with the column set as the index and leaves the original dataframe unchanged. You can also pass inplace=True to modify the dataframe in place, in which case set_index() changes the original dataframe and returns None.
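As a quick check of this return behaviour, here is a minimal sketch on a small made-up dataframe. The names `demo`, `Col` and `Val` are placeholders for illustration and are not part of the tutorial's data.

```python
import pandas as pd

# hypothetical two-column dataframe, only for illustration
demo = pd.DataFrame({"Col": ["a", "b", "c"], "Val": [1, 2, 3]})

# without inplace, a new dataframe comes back and demo is left untouched
indexed = demo.set_index("Col")
original_columns = list(demo.columns)    # "Col" is still a regular column here
indexed_columns = list(indexed.columns)  # "Col" has moved into the index

# with inplace=True, demo itself is modified and the call returns None
result = demo.set_index("Col", inplace=True)
```

Because the in-place call returns None, writing `df = df.set_index("Col", inplace=True)` would overwrite `df` with None. Either reassign the returned dataframe or use inplace=True, not both.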

To use a combination of columns as the index, pass a list of column names as the argument.

Examples

Let’s now look at some examples of using the above syntax –

First, we will create a dataframe that we will use throughout this tutorial.

import pandas as pd

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}

# create pandas dataframe
df = pd.DataFrame(data)

# display the dataframe
df

Output:

employees dataframe with four columns

Here, we created a dataframe with information about some employees in an office. You can see that the dataframe has the columns – “Name”, “Age”, “Department”, and “Salary”.

Example 1 – Set a column as the dataframe index

The dataframe created above has a default integer index (0 to 3). Let’s modify the dataframe so that it uses the “Name” column as its index.

# set Name column as index
df.set_index("Name", inplace=True)
# display the dataframe
df

Output:

employees dataframe with the "Name" column set as index

You can see that the dataframe’s index now holds the values from the “Name” column, and “Name” is no longer one of the regular columns.

Let’s print out the dataframe’s index.

# display the dataframe index
df.index

Output:

Index(['Jim', 'Dwight', 'Angela', 'Tobi'], dtype='object', name='Name')

The index holds the values from the original “Name” column, and the index itself is also named “Name”.
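One practical use of a column-based index is looking up rows by label. The following is a minimal sketch that rebuilds the same employee data under a separate name (`employees`, chosen here only so the snippet runs on its own) and selects a row with .loc.

```python
import pandas as pd

# same employee data as above, rebuilt so this snippet is self-contained
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}
employees = pd.DataFrame(data).set_index("Name")

# select the full row for one employee by index label (returns a Series)
dwight = employees.loc["Dwight"]

# select a single value by index label and column name
dwight_salary = employees.loc["Dwight", "Salary"]
```

Here `dwight` is a Series whose labels are the remaining columns (“Age”, “Department”, “Salary”), and `dwight_salary` is the single value from the “Salary” column for that row.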

You can use the pandas dataframe reset_index() method to restore the default integer index. The current index is moved back into the dataframe as a regular column.

# reset the index
df.reset_index(inplace=True)
# display the dataframe
df

Output:

employees dataframe with the default index

Here, we passed inplace=True to modify the dataframe in place.
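By default, reset_index() keeps the old index values as a column. If you would rather discard them, reset_index() also accepts drop=True. The sketch below compares the two on a shortened, made-up version of the employee data (the variable names are illustrative only).

```python
import pandas as pd

# shortened employee data, only for illustration
employees = pd.DataFrame(
    {"Name": ["Jim", "Dwight"], "Age": [26, 28]}
).set_index("Name")

# default: "Name" comes back as a regular column
restored = employees.reset_index()

# drop=True: the "Name" values are discarded instead of restored
discarded = employees.reset_index(drop=True)
```

In both cases the result has the default integer index. Only the default call brings the “Name” column back.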

Example 2 – Set multiple columns as dataframe index

You can also set the index of a dataframe as a combination of multiple columns.

For example, let’s use the combination of the “Name” and the “Age” columns as the index for the above dataframe.

# set combination of Name and Age columns as index
df.set_index(["Name", "Age"], inplace=True)
# display the dataframe
df

Output:

employees dataframe with "Name" and "Age" columns set as index

You can see that now the dataframe index is a combination of the “Name” and “Age” column values.

If you print out the dataframe index you’ll see that the dataframe index is now of MultiIndex type.

# display the dataframe index
df.index

Output:

MultiIndex([(   'Jim', 26),
            ('Dwight', 28),
            ('Angela', 27),
            (  'Tobi', 32)],
           names=['Name', 'Age'])
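With a MultiIndex, a row is identified by a tuple containing one value per index level. The following is a minimal sketch, again rebuilding the employee data under the illustrative name `employees` so that it runs on its own.

```python
import pandas as pd

# same employee data as above, rebuilt so this snippet is self-contained
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}
employees = pd.DataFrame(data).set_index(["Name", "Age"])

# number of levels in the index (one per column passed to set_index)
levels = employees.index.nlevels

# look up a single value using the full (Name, Age) key
angela_dept = employees.loc[("Angela", 27), "Department"]

# move only the "Age" level back to a column, keeping "Name" as the index
names_only = employees.reset_index(level="Age")
```

Passing level="Age" to reset_index() undoes only part of the MultiIndex, which can be handy if you find that one level is enough to identify the rows.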

Summary

In this tutorial, we looked at how to set a column in a pandas dataframe as its index. The following are the key takeaways –

  • Use the pandas dataframe set_index() method to set a column (or combination of columns) as the index of the dataframe.
  • If you set a combination of columns as the index, the dataframe index will be of MultiIndex type.
  • Use the pandas dataframe reset_index() method to restore the default integer index and move the current index back into a regular column.


Author

  • Piyush

    Piyush is a data scientist passionate about using data to understand things better and make informed decisions. In the past, he's worked as a Data Scientist for ZS and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.