Pandas – Set Category Order of a Categorical Column

In this tutorial, we will look at how to set a category order in a Pandas category type column with the help of some examples.

How to set category order in Pandas?

You can use the Pandas categorical set_categories() function to set and order categories in a category type column. Use the .cat accessor to apply this function on a Pandas column. The following is the syntax –

# set and order categories
df["Col"] = df["Col"].cat.set_categories(category_order_list, ordered=True)

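Pass the categories as a list in the order you want, from lowest to highest, along with ordered=True. This makes the column an ordered categorical column with the given category order. Note that set_categories() returns a new column, so assign the result back as shown above.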
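One behaviour worth keeping in mind when choosing the list: any value in the column that is not in the list you pass is replaced with NaN. The following is a minimal sketch of this, using a small made-up Series (the "XL" value is hypothetical and is not part of this tutorial's data).

```python
import pandas as pd

# hypothetical data: "XL" will not be included in the new category list
sizes = pd.Series(["S", "M", "XL"], dtype="category")

# set and order categories, leaving out "XL"
sizes = sizes.cat.set_categories(["S", "M", "L"], ordered=True)

# the value that is not in the new categories becomes NaN
missing_mask = sizes.isna().tolist()
print(missing_mask)  # [False, False, True]
```

If you want to keep every existing value, make sure the list covers all the values present in the column.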

Examples

Let’s look at some examples of setting the category order for a category type column in Pandas. First, we will create a sample dataframe that we will be using throughout this tutorial.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
        "Name": ["Tim", "Sarah", "Hasan", "Jyoti", "Jack"],
        "Year": ["Junior", "Senior", "Freshman", "Junior", "Freshman"],
        "Shirt Size": ["S", "M", "L", "S", "L"]
})
# change to category dtype
df["Year"] = df["Year"].astype("category")
df["Shirt Size"] = df["Shirt Size"].astype("category")
# display the dataframe
print(df)

Output:

    Name      Year Shirt Size
0    Tim    Junior          S
1  Sarah    Senior          M
2  Hasan  Freshman          L
3  Jyoti    Junior          S
4   Jack  Freshman          L

We now have a dataframe containing the name, year, and t-shirt size of some students in a university. Note that the “Year” and “Shirt Size” columns are of category type.

Let’s print out the “Year” column.

# display the column
print(df["Year"])

Output:

0      Junior
1      Senior
2    Freshman
3      Junior
4    Freshman
Name: Year, dtype: category
Categories (3, object): ['Freshman', 'Junior', 'Senior']

You can see that the “Year” column is of category dtype. Note that the category values in this column are not ordered. That is, for example, the information that “Senior” is greater than “Junior” is not encoded in the values.
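As a quick check of what "not ordered" means in practice, the sketch below rebuilds the “Year” values as a standalone Series (done here only to keep the snippet self-contained) and shows that the ordered flag is False and that a greater-than comparison is rejected.

```python
import pandas as pd

# same values as the "Year" column, as an unordered categorical
year = pd.Series(["Junior", "Senior", "Freshman", "Junior", "Freshman"], dtype="category")

# the ordered flag of an unordered categorical is False
is_ordered = year.cat.ordered
print(is_ordered)  # False

# order-based comparisons raise a TypeError for unordered categories
try:
    year > "Freshman"
    compare_failed = False
except TypeError:
    compare_failed = True
print(compare_failed)  # True
```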

Set Category Order of a Category type column in Pandas

You can convert an unordered categorical type column to an ordered categorical column. Let’s convert the “Year” column to an ordered category column with category order “Freshman” < “Sophomore” < “Junior” < “Senior”. For this, we will use the Pandas categorical set_categories() function.

# set and order categories
df["Year"] = df["Year"].cat.set_categories(["Freshman", "Sophomore", "Junior", "Senior"], ordered=True)
# display the column
print(df["Year"])

Output:

0      Junior
1      Senior
2    Freshman
3      Junior
4    Freshman
Name: Year, dtype: category
Categories (4, object): ['Freshman' < 'Sophomore' < 'Junior' < 'Senior']

You can see that the categories are now ordered. Note that “Sophomore” does not appear in the data, but it is still a valid category and its position in the order is recorded in the column's dtype.
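Once the order is set, comparisons and sorting follow the category order rather than alphabetical order. The following sketch illustrates this on a standalone copy of the “Year” values (rebuilt here only so that the snippet runs on its own).

```python
import pandas as pd

# same values as the "Year" column
year = pd.Series(["Junior", "Senior", "Freshman", "Junior", "Freshman"], dtype="category")
# set and order categories
year = year.cat.set_categories(["Freshman", "Sophomore", "Junior", "Senior"], ordered=True)

# comparisons now use the category order
above_freshman = (year > "Freshman").tolist()
print(above_freshman)  # [True, True, False, True, False]

# sorting also follows the category order
sorted_year = year.sort_values().tolist()
print(sorted_year)  # ['Freshman', 'Freshman', 'Junior', 'Junior', 'Senior']
```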

Let’s look at another example. First, let’s print out the “Shirt Size” column.

# display the column
print(df["Shirt Size"])

Output:

0    S
1    M
2    L
3    S
4    L
Name: Shirt Size, dtype: category
Categories (3, object): ['L', 'M', 'S']

You can see that this column is also of category type but is currently unordered. Let’s set the category order for the “Shirt Size” column to “S” < “M” < “L”.

# set and order categories
df["Shirt Size"] = df["Shirt Size"].cat.set_categories(["S", "M", "L"], ordered=True)
# display the column
print(df["Shirt Size"])

Output:

0    S
1    M
2    L
3    S
4    L
Name: Shirt Size, dtype: category
Categories (3, object): ['S' < 'M' < 'L']

The categories in the “Shirt Size” column are now ordered.
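A simple way to see the effect of the order is to take the minimum and maximum of the column. The sketch below does this on a standalone copy of the “Shirt Size” values (rebuilt here only so that the snippet runs on its own). With the order “S” < “M” < “L”, the smallest size is "S", whereas plain alphabetical ordering of the strings would put "L" first.

```python
import pandas as pd

# same values as the "Shirt Size" column
size = pd.Series(["S", "M", "L", "S", "L"], dtype="category")
# set and order categories
size = size.cat.set_categories(["S", "M", "L"], ordered=True)

# min and max follow the category order, not alphabetical order
smallest = size.min()
largest = size.max()
print(smallest, largest)  # S L
```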

If, on the other hand, you only want to change the order of the categories in a categorical column without adding or removing any, use the Pandas categorical reorder_categories() function. It expects the same set of categories that the column already has.
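For illustration, here is a minimal sketch of reorder_categories() on a standalone copy of the “Shirt Size” values. The reversed order “L” < “M” < “S” is chosen only to make the change visible and is not something this tutorial's data calls for.

```python
import pandas as pd

# same values as the "Shirt Size" column, ordered as "S" < "M" < "L"
size = pd.Series(["S", "M", "L", "S", "L"], dtype="category")
size = size.cat.set_categories(["S", "M", "L"], ordered=True)

# reorder the existing categories (the list must contain the same categories)
reversed_size = size.cat.reorder_categories(["L", "M", "S"], ordered=True)

new_order = reversed_size.cat.categories.tolist()
print(new_order)  # ['L', 'M', 'S']

# the values themselves are unchanged, only the order of the categories differs
print(reversed_size.tolist())  # ['S', 'M', 'L', 'S', 'L']
```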

Author

  • Piyush

    Piyush is a data scientist passionate about using data to understand things better and make informed decisions. In the past, he's worked as a Data Scientist for ZS and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.