In this tutorial, we will look at how to remove categories from a category type column in Pandas with the help of some examples.
How to remove categories from a Pandas Categorical Column?

You can use the Pandas remove_categories() method to remove categories from a categorical field in Pandas. For a Pandas series, use the .cat accessor to apply this function.
The following is the syntax –
    # remove a category value from a category type column in Pandas
    df["Col"] = df["Col"].cat.remove_categories("category_value_to_remove")
Pass the category or a list of categories (if removing multiple categories) as an argument to the function. The passed categories are removed from the list of possible category values for that field.
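Two details of this call are easy to miss, so here is a minimal sketch of them. It assumes a small throwaway series named sizes (a hypothetical name, not part of this tutorial's dataframe). It shows that remove_categories() returns a new series rather than modifying the original, and that asking to remove a category that is not among the existing categories raises a ValueError.

```python
import pandas as pd

# Hypothetical example data, separate from the tutorial's dataframe
sizes = pd.Series(["S", "M", "L"], dtype="category")

# remove_categories() returns a new Series; `sizes` itself is unchanged
trimmed = sizes.cat.remove_categories("L")
print(list(sizes.cat.categories))    # ['L', 'M', 'S']
print(list(trimmed.cat.categories))  # ['M', 'S']

# Removing a category that does not exist raises ValueError
try:
    sizes.cat.remove_categories("XXL")
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

This is why the examples in this tutorial assign the result back to the column.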
Examples
Let’s look at some examples of removing categories from a categorical field. First, we’ll create a Pandas dataframe that we will be using throughout this tutorial.
    import pandas as pd

    # create a dataframe
    df = pd.DataFrame({
        "Name": ["Tim", "Sarah", "Hasan", "Jyoti", "Jack"],
        "Shirt Size": ["S", "M", "XL", "S", "L"]
    })

    # change to category dtype
    df["Shirt Size"] = df["Shirt Size"].astype("category")

    # display the dataframe
    print(df)
Output:
        Name Shirt Size
    0    Tim          S
    1  Sarah          M
    2  Hasan         XL
    3  Jyoti          S
    4   Jack          L
We now have a dataframe containing the names and the corresponding t-shirt sizes of students in a university. The “Shirt Size” column is of category type. Let’s print out the category column to see its data and the possible category values.
    # display the "Shirt Size" column
    print(df["Shirt Size"])
Output:
    0     S
    1     M
    2    XL
    3     S
    4     L
    Name: Shirt Size, dtype: category
    Categories (4, object): ['L', 'M', 'S', 'XL']
You can see that “L”, “M”, “S”, and “XL” are the possible category values in the “Shirt Size” column. These values are inferred from the data when the column is converted to the category dtype.
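If you only want the possible category values rather than the full column printout, you can read them from the .cat accessor. The following is a minimal sketch that rebuilds the shirt size data as a standalone series; the variable names shirt, categories, and is_ordered are illustrative.

```python
import pandas as pd

# Same shirt size values as the tutorial's dataframe, as a standalone Series
shirt = pd.Series(["S", "M", "XL", "S", "L"], dtype="category")

# The possible category values, inferred from the data and sorted
categories = list(shirt.cat.categories)
print(categories)  # ['L', 'M', 'S', 'XL']

# Whether the categories carry an order (they do not by default)
is_ordered = shirt.cat.ordered
print(is_ordered)  # False
```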
Remove category value from a categorical column
In the above dataframe, let’s remove “L” as a possible category value for the “Shirt Size” categorical column. For this, we apply the remove_categories() function with the help of the .cat accessor on the “Shirt Size” column and pass “L” as an argument.
    # remove category value "L" from "Shirt Size" column
    df["Shirt Size"] = df["Shirt Size"].cat.remove_categories("L")

    # display the "Shirt Size" column
    print(df["Shirt Size"])
Output:
    0      S
    1      M
    2     XL
    3      S
    4    NaN
    Name: Shirt Size, dtype: category
    Categories (3, object): ['M', 'S', 'XL']
You can see that now “L” is not a possible category value for the “Shirt Size” column. Note that the data having “L” as the value is now NaN.
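Because removed categories turn into NaN in the data, it can help to check how many values were affected. Here is a minimal sketch, again on a standalone copy of the shirt size data. Dropping the missing rows with dropna() is just one possible follow-up, not something this tutorial requires.

```python
import pandas as pd

shirt = pd.Series(["S", "M", "XL", "S", "L"], dtype="category")

# Remove "L"; rows that held "L" become NaN
removed = shirt.cat.remove_categories("L")

# Count the values that became missing
n_missing = int(removed.isna().sum())
print(n_missing)  # 1

# Optionally drop the rows that no longer have a valid category
cleaned = removed.dropna()
print(len(cleaned))  # 4
```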
Remove multiple categories from a categorical column
To remove multiple categories from a categorical field, pass the categories to remove as a list to the remove_categories() function. Let’s remove “M” and “XL” as possible values from the “Shirt Size” column.
    # remove categories "M" and "XL" from "Shirt Size" column
    df["Shirt Size"] = df["Shirt Size"].cat.remove_categories(["M", "XL"])

    # display the "Shirt Size" column
    print(df["Shirt Size"])
Output:
    0      S
    1    NaN
    2    NaN
    3      S
    4    NaN
    Name: Shirt Size, dtype: category
    Categories (1, object): ['S']
You can see that the “Shirt Size” column does not contain “M” and “XL” as possible category values. We now only have “S” as a possible category value because we removed “L” in the previous example and “M” and “XL” in this example.
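Since the removals in these examples build on each other, rerunning a step would ask Pandas to remove a category that is already gone, which raises a ValueError. One way to guard against that is to keep only the categories that are still present before calling remove_categories(). The following is a sketch of that pattern; the names to_remove and present are illustrative.

```python
import pandas as pd

shirt = pd.Series(["S", "M", "XL", "S", "L"], dtype="category")
shirt = shirt.cat.remove_categories("L")  # "L" is already removed

# Categories we would like gone, including one that no longer exists
to_remove = ["L", "M", "XL"]

# Keep only those that are still among the current categories
present = [c for c in to_remove if c in shirt.cat.categories]
print(present)  # ['M', 'XL']

shirt = shirt.cat.remove_categories(present)
print(list(shirt.cat.categories))  # ['S']
```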
Remove unused categories from a categorical column in Pandas
There’s an additional function for a specific use case: removing unused category values from a category type column. Unused categories are values that are part of the possible category values but do not occur in the data.
You can use the remove_unused_categories() function to remove unused categories from a categorical field in Pandas. Like remove_categories(), it is applied through the .cat accessor, but it takes no arguments. Let’s look at an example.
    # series of shirt sizes
    shirt_sizes = pd.Series(pd.Categorical(["L", "M", "L", "M", "L"], categories=["S", "M", "L", "XL"]))

    # display the series
    print(shirt_sizes)
Output:
    0    L
    1    M
    2    L
    3    M
    4    L
    dtype: category
    Categories (4, object): ['S', 'M', 'L', 'XL']
The above Pandas series is of category type and has its set of possible values as “S”, “M”, “L”, and “XL”. If you look at the data in the series, the categories “S” and “XL” do not occur in the data. Let’s remove these categories as possible category values.
    # remove unused categories
    shirt_sizes = shirt_sizes.cat.remove_unused_categories()

    # display the series
    print(shirt_sizes)
Output:
    0    L
    1    M
    2    L
    3    M
    4    L
    dtype: category
    Categories (2, object): ['M', 'L']
You can see that the resulting series doesn’t have any unused category values.
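Unused categories often appear after filtering a dataframe, because the filtered column keeps every category of the original. Here is a minimal sketch of that situation using the same names and shirt sizes as earlier in this tutorial. The filter on “S” and “M” and the variable names small, before, and after are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tim", "Sarah", "Hasan", "Jyoti", "Jack"],
    "Shirt Size": pd.Categorical(["S", "M", "XL", "S", "L"]),
})

# Keep only the rows with small or medium shirts
small = df[df["Shirt Size"].isin(["S", "M"])].copy()

# The filtered column still lists all four original categories
before = list(small["Shirt Size"].cat.categories)
print(before)  # ['L', 'M', 'S', 'XL']

# Drop the categories that no longer occur in the data
small["Shirt Size"] = small["Shirt Size"].cat.remove_unused_categories()
after = list(small["Shirt Size"].cat.categories)
print(after)  # ['M', 'S']
```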
You might also be interested in –
- Get List of Categories in Pandas Category Column
- Pandas – Rename Categories in Category Column
- Pandas – Change Column Type to Category