
Pandas – Convert Category Type Column to Integer

In this tutorial, we will look at how to convert a category type column in Pandas to an integer type column with the help of some examples.

How to convert category to int type in Pandas?

You can use the Pandas astype() function to change the data type of a column. To convert a category type column to integer type, call astype() on the column and pass "int" as the argument. The syntax is:

# convert pandas column to int type
df["Col"] = df["Col"].astype("int") 

This changes the type of the column to int. Note that if any value in the column cannot be converted to an integer, astype() raises a ValueError. For example, 1 and "1" can be converted to an integer, but "one" cannot.
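The rule above can be checked with a minimal sketch. The helper name try_to_int is hypothetical (it is not part of Pandas) and the sample values are made up for illustration.

```python
import pandas as pd


def try_to_int(values):
    """Hypothetical helper: build a category Series and try to cast it to int.

    Returns the converted Series, or None if Pandas raises a ValueError.
    """
    series = pd.Series(values, dtype="category")
    try:
        return series.astype("int")
    except ValueError:
        return None


from_ints = try_to_int([1, 2, 2])                # integer categories
from_digit_strings = try_to_int(["1", "2", "2"])  # strings that look like integers
from_words = try_to_int(["one", "two", "two"])    # strings that are not numbers

print(from_ints)
print(from_digit_strings)
print(from_words)
```

Only the last call should fall into the except branch, since "one" and "two" cannot be parsed as integers.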

Examples

Let’s look at some examples of converting category type column(s) to integer type in Pandas. First, we will create a Pandas dataframe that we’ll be using throughout this tutorial.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
        "Name": ["Tim", "Sarah", "Hasan", "Jyoti", "Jack"],
        "Class": [1, 2, 2, 3, 1]
})
# change to category dtype
df["Class"] = df["Class"].astype("category")
# display the dataframe
print(df)

Output:

    Name Class
0    Tim     1
1  Sarah     2
2  Hasan     2
3  Jyoti     3
4   Jack     1

We have a dataframe containing the names and the class (or grade) of some students in a primary school.

# display the "Class" column
print(df["Class"])

Output:

0    1
1    2
2    2
3    3
4    1
Name: Class, dtype: category
Categories (3, int64): [1, 2, 3]

The “Class” column in the above dataframe is of category type, and its categories are integers (int64), as shown in the last line of the output.
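If you would rather check this in code than read it off the printed output, one possible approach is to inspect the dtype of the categories themselves. This is a small sketch on a standalone Series named classes (an assumed name); is_integer_dtype comes from pandas.api.types.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype

# same values as the "Class" column, stored as a category Series
classes = pd.Series([1, 2, 2, 3, 1], dtype="category")

# the Series itself has category dtype, so this is False
series_is_int = is_integer_dtype(classes.dtype)

# the underlying categories are integers, so this is True
categories_are_int = is_integer_dtype(classes.cat.categories.dtype)

print(series_is_int, categories_are_int)
```

When the categories are already integers, converting the column with astype("int") should be safe as long as there are no missing values.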

Let’s convert this column from category to int type using the Pandas astype() function.

# category column to integer
df["Class"] = df["Class"].astype("int")
# display the "Class" column
print(df["Class"])

Output:

0    1
1    2
2    2
3    3
4    1
Name: Class, dtype: int64

You can see that the column is now of int64 type.
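If a dataframe has several category columns, one way to convert them all in a single step is to pass a dictionary to astype(). The sketch below uses a separate dataframe named scores with a made-up "Section" column, and it assumes every category column holds integer-like values.

```python
import pandas as pd

# hypothetical dataframe with two category columns
scores = pd.DataFrame({
    "Name": ["Tim", "Sarah", "Hasan"],
    "Class": [1, 2, 2],
    "Section": [10, 20, 10],
})
scores = scores.astype({"Class": "category", "Section": "category"})

# find the category columns and build a {column: dtype} mapping for astype()
cat_cols = scores.select_dtypes(include="category").columns
scores = scores.astype({col: "int" for col in cat_cols})

print(scores.dtypes)
```

Columns that are not listed in the dictionary, such as "Name", keep their original type.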

Category column with non-numerical values to integer

In the above example, the individual values in the category column were numeric. What if you try to convert a category column with non-numeric values to an int type column? Let’s find out.

# add a new column to store the class in words
df["Class2"] = ["First", "Second", "Second", "Third", "First"]
# convert column to category type
df["Class2"] = df["Class2"].astype("category")
# display the column
print(df["Class2"])

Output:

0     First
1    Second
2    Second
3     Third
4     First
Name: Class2, dtype: category
Categories (3, object): ['First', 'Second', 'Third']

We added an additional column to our dataframe. The “Class2” column stores the class values in words. For example, class 1 is stored as “First”, class 2 is stored as “Second” and so on.

Let’s try to convert the “Class2” column to integer type. We’ll use the same syntax as above.

# category column to integer
df["Class2"] = df["Class2"].astype("int")
# display the "Class2" column
print(df["Class2"])

Output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [8], in <module>
      1 # category column to integer
----> 2 df["Class2"] = df["Class2"].astype("int")
      3 # display the "Class2" column
      4 print(df["Class2"])
...
ValueError: Cannot cast object dtype to int64

We get a ValueError because the categories in the column ("First", "Second" and "Third") are strings that cannot be converted to integers.
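If you still need integers for a column like this, two common workarounds are sketched below on a standalone Series named class_words (an assumed name). The first uses the integer codes that Pandas assigns to each category, and the second uses an explicit dictionary, word_to_number, which is a hypothetical mapping written for this example.

```python
import pandas as pd

class_words = pd.Series(
    ["First", "Second", "Second", "Third", "First"], dtype="category"
)

# Option 1: integer codes assigned by pandas.
# Each code is the position of the value in class_words.cat.categories.
codes = class_words.cat.codes

# Option 2: an explicit mapping (hypothetical, chosen for this example),
# followed by astype("int") so the result has an integer dtype.
word_to_number = {"First": 1, "Second": 2, "Third": 3}
numbers = class_words.map(word_to_number).astype("int")

print(codes.tolist())
print(numbers.tolist())
```

Note that the codes start at 0 and follow the order of the categories, so they are labels rather than meaningful numbers. The explicit mapping is usually the better choice when the integer values themselves matter.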

Author

  • Piyush Raj

    Piyush is a data professional passionate about using data to understand things better and make informed decisions. In the past, he's worked as a Data Scientist for ZS and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.