Get List of Categories in Pandas Category Column

In this tutorial, we will look at how to get the list of categories from a Pandas category column with the help of some examples.

How to get all possible category values in a category type column in Pandas?

Categorical data in Pandas has two properties: categories and ordered. The categories property stores the possible values for the categorical data, and the ordered property indicates whether those values have a meaningful order.

You can use the .cat accessor to get the categories property of a category type column in Pandas. The following is the syntax –

# get all categories of a category type column
df["Col"].cat.categories

It returns a Pandas Index containing the possible category values in the column.
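
As a minimal sketch, both properties mentioned above can be read through the .cat accessor. The series name and its values below are made up for illustration and are not part of this tutorial's dataframe.

```python
import pandas as pd

# hypothetical example data, used only for this illustration
grades = pd.Series(["low", "high", "low"], dtype="category")

# the possible category values, returned as a Pandas Index
print(grades.cat.categories)

# whether the categories have a meaningful order
print(grades.cat.ordered)  # False
```

A column converted with the plain "category" dtype is unordered by default, which is why the second print shows False here.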

Examples

Let’s look at some examples of using the above method to get the list of categories in a category type column in Pandas. First, we’ll create a dataframe that we will be using throughout this tutorial.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    "Name": ["Tim", "Sarah", "Hasan", "Jyoti", "Jack"],
    "Gender": ["Male", "Female", "Male", "Female", "Male"]
})
# change to category dtype
df["Gender"] = df["Gender"].astype("category")
# display the dataframe
print(df)

Output:

    Name  Gender
0    Tim    Male
1  Sarah  Female
2  Hasan    Male
3  Jyoti  Female
4   Jack    Male

We now have a dataframe containing the names and genders of some university students.

The “Gender” column is of category type.

# display the Gender column
print(df["Gender"])

Output:

0      Male
1    Female
2      Male
3    Female
4      Male
Name: Gender, dtype: category
Categories (2, object): ['Female', 'Male']

Let’s now get all the possible categories for this column using the syntax mentioned above.

# get all categories
print(df["Gender"].cat.categories)

Output:

Index(['Female', 'Male'], dtype='object')

You can see that we get all the category values for the “Gender” column.
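
Since the result is an Index rather than a plain Python list, you may want to convert it when a list is needed. The following is a small sketch of one way to do that with tolist(). It rebuilds the “Gender” values in a standalone series (the name genders is chosen here for illustration) so that it runs on its own.

```python
import pandas as pd

# same values as the "Gender" column, rebuilt so the snippet is self-contained
genders = pd.Series(["Male", "Female", "Male", "Female", "Male"], dtype="category")

# convert the Index of categories to a regular Python list
categories_list = genders.cat.categories.tolist()
print(categories_list)  # ['Female', 'Male']
```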

As another example, let’s add a column to our dataframe that stores the shirt size of each student.

# add column to store shirt size
df["Shirt Size"] = ["M", "S", "M", "M", "S"]
# change type to category
df["Shirt Size"] = df["Shirt Size"].astype("category")
# set and order categories for the shirt size column
df["Shirt Size"] = df["Shirt Size"].cat.set_categories(["S", "M", "L"], ordered=True)
# display the column
print(df["Shirt Size"])

Output:

0    M
1    S
2    M
3    M
4    S
Name: Shirt Size, dtype: category
Categories (3, object): ['S' < 'M' < 'L']

Note that the “Shirt Size” column contains ordered categorical values. Let’s print out the possible category values for this column.

# get all categories
print(df["Shirt Size"].cat.categories)

Output:

Index(['S', 'M', 'L'], dtype='object')

You can see that we get “S”, “M”, and “L” as the possible values for the “Shirt Size” column. Note that the size “L” does not appear in our data, but since it’s a declared category, the resulting Index includes it.
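
If you only want the categories that actually occur in the data, one possible approach is to drop the unused ones first with remove_unused_categories(). The sketch below rebuilds the shirt sizes in a standalone series (the variable names are chosen here for illustration) so that it runs on its own.

```python
import pandas as pd

# same values and categories as the "Shirt Size" column above
sizes = pd.Series(
    pd.Categorical(["M", "S", "M", "M", "S"], categories=["S", "M", "L"], ordered=True)
)

# all declared categories, including the unused "L"
declared = sizes.cat.categories.tolist()
print(declared)  # ['S', 'M', 'L']

# only the categories that appear in the data
observed = sizes.cat.remove_unused_categories().cat.categories.tolist()
print(observed)  # ['S', 'M']
```

Note that remove_unused_categories() returns a new series here, so the original column keeps all three categories.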

Author

  • Piyush is a data scientist passionate about using data to understand things better and make informed decisions. In the past, he's worked as a Data Scientist for ZS and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.