Pandas – Set Column as Index (With Examples)

In this tutorial, we will look at how to set a column of a pandas dataframe as its index, with the help of some examples.

How to set a column as the index in a pandas dataframe?

You can use the pandas dataframe set_index() function to set a column as the index of a pandas dataframe. Pass the column name as an argument.

The following is the syntax –

# set a column as dataframe's index
df.set_index("Col")

# set combination of columns as dataframe's index
df.set_index(["Col1", "Col2"])

It returns a new dataframe with the column set as the index and leaves the original dataframe unchanged. You can also pass inplace=True to modify the dataframe in place, in which case set_index() changes the original dataframe and returns None.

You can also use a combination of columns as the index for the dataframe. In that case, pass a list of column names as the argument.
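
As a minimal sketch of the difference between the two calling styles, the snippet below uses a small made-up dataframe (the column names "Col" and "Val" and their values are illustrative only). It shows that, without inplace=True, the result has to be captured in a variable because the original dataframe is left as it was.

```python
import pandas as pd

# small illustrative dataframe (names and values are made up for this sketch)
df = pd.DataFrame({"Col": ["a", "b", "c"], "Val": [1, 2, 3]})

# without inplace, set_index() returns a new dataframe
indexed = df.set_index("Col")

# the original dataframe still has both columns and its default index
original_columns = list(df.columns)
# in the returned dataframe, "Col" has moved from the columns to the index
indexed_columns = list(indexed.columns)
index_name = indexed.index.name

print(original_columns, indexed_columns, index_name)
```

Assigning the result to a new name (or back to df) is usually the simpler style, since it keeps the original data available for comparison.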


Examples

Let’s now look at some examples of using the above syntax –

First, we will create a dataframe that we will use throughout this tutorial.

import pandas as pd

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}

# create pandas dataframe
df = pd.DataFrame(data)

# display the dataframe
df

Output:

[Image: employees dataframe with four columns]

Here, we created a dataframe with information about some employees in an office. You can see that the dataframe has the columns – “Name”, “Age”, “Department”, and “Salary”.

Example 1 – Set a column as the dataframe index

The dataframe created above has a default index. Let’s modify the dataframe so that it uses the “Name” column as its index.

# set Name column as index
df.set_index("Name", inplace=True)
# display the dataframe
df

Output:

[Image: employees dataframe with the "Name" column set as index]

You can see that now the dataframe’s index is the values from the “Name” column.


Let’s print out the dataframe’s index.

# display the dataframe index
df.index

Output:

Index(['Jim', 'Dwight', 'Angela', 'Tobi'], dtype='object', name='Name')

The index holds the values from the original “Name” column and also carries that column’s name (name='Name').
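
By default, set_index() takes the column out of the dataframe's columns once it becomes the index. The sketch below rebuilds part of the employee data so that it runs on its own, then contrasts the default with drop=False, which keeps the column as well. It also shows a label-based lookup with loc, one common reason to use a column like “Name” as the index. The variable names moved, kept, and dwight_age are illustrative.

```python
import pandas as pd

# a subset of the employee data, rebuilt so this snippet is self-contained
df = pd.DataFrame({
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
})

# default behaviour: "Name" moves out of the columns and into the index
moved = df.set_index("Name")

# drop=False keeps "Name" as a regular column in addition to the index
kept = df.set_index("Name", drop=False)

print(list(moved.columns))  # ['Age']
print(list(kept.columns))   # ['Name', 'Age']

# rows can now be selected by name instead of by position
dwight_age = moved.loc["Dwight", "Age"]
print(dwight_age)           # 28
```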

You can use the pandas dataframe reset_index() function to reset the index of the dataframe back to its default integer index. By default, this also moves the current index (“Name”) back into a regular column.

# reset the index
df.reset_index(inplace=True)
# display the dataframe
df

Output:

[Image: employees dataframe with the default index]

Here, we passed inplace=True to modify the dataframe in place.
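
Like set_index(), reset_index() returns a new dataframe when inplace=True is not passed. The following sketch, which uses a two-row subset of the employee data and illustrative variable names, shows that behaviour along with the drop parameter, which discards the old index instead of restoring it as a column.

```python
import pandas as pd

# two-row subset of the employee data with "Name" already set as the index
df = pd.DataFrame({
    "Name": ["Jim", "Dwight"],
    "Salary": [55000, 60000],
}).set_index("Name")

# returns a new dataframe; "Name" becomes a regular column again
restored = df.reset_index()

# drop=True throws the old index away rather than turning it into a column
discarded = df.reset_index(drop=True)

print(list(restored.columns))   # ['Name', 'Salary']
print(list(discarded.columns))  # ['Salary']
print(list(restored.index))     # [0, 1]
```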

Example 2 – Set multiple columns as dataframe index

You can also set the index of a dataframe to a combination of multiple columns.

For example, let’s use the combination of the “Name” and the “Age” columns as the index for the above dataframe.

# set combination of Name and Age columns as index
df.set_index(["Name", "Age"], inplace=True)
# display the dataframe
df

Output:

[Image: employees dataframe with "Name" and "Age" columns set as index]

You can see that now the dataframe index is a combination of the “Name” and “Age” column values.

If you print out the dataframe index, you’ll see that it is now a MultiIndex.

# display the dataframe index
df.index

Output:

MultiIndex([(   'Jim', 26),
            ('Dwight', 28),
            ('Angela', 27),
            (  'Tobi', 32)],
           names=['Name', 'Age'])
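
To make the MultiIndex a little more concrete, here is a short sketch that rebuilds the employee dataframe so it runs on its own. It checks the number of index levels, looks up a value by giving one label per level, and moves both levels back into columns. The variable names dept and flat are illustrative.

```python
import pandas as pd

# the employee data from above, indexed by the Name and Age columns
df = pd.DataFrame({
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000],
}).set_index(["Name", "Age"])

# the index has one level per column that was passed to set_index()
print(df.index.nlevels)      # 2
print(list(df.index.names))  # ['Name', 'Age']

# select a value by giving one label per level as a tuple, plus a column name
dept = df.loc[("Dwight", 28), "Department"]
print(dept)                  # Sales

# reset_index() moves both index levels back into regular columns
flat = df.reset_index()
print(list(flat.columns))    # ['Name', 'Age', 'Department', 'Salary']
```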

Summary

In this tutorial, we looked at how to set a column in a pandas dataframe as its index. The following are the key takeaways –

  • Use the pandas dataframe set_index() function to set a column (or combination of columns) as the index of the dataframe.
  • If you set a combination of columns as the index, the dataframe index will be of MultiIndex type.
  • Use the pandas dataframe reset_index() function to reset the index of the dataframe to its default.

Author

  • Piyush Raj

    Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.