Pandas – Count occurrences of value in a column

In this tutorial, we will look at how to count the occurrences of a value in a pandas dataframe column with the help of some examples.

Count occurrences of value in column

You can use the pandas series value_counts() function to count occurrences of each value in a pandas column. The following is the syntax:

# count occurrences of each value in a column
df[col].value_counts()
# count occurrences of a specific value in a column
df[col].value_counts()[value]

It returns a pandas series containing the counts of unique values.
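As a quick illustration of what that returned series looks like, here is a minimal sketch on a small made-up column (the values 'a' and 'b' and the variable names are assumptions for illustration, not part of the tutorial's dataset). The unique values become the index and, by default, the most frequent value comes first and missing values are left out. Passing dropna=False includes them.

```python
import pandas as pd

# hypothetical sample column with one missing value (illustration only)
s = pd.Series(['a', 'b', 'a', None, 'a', 'b'])

# counts of each unique value; missing values are excluded by default
counts = s.value_counts()
print(type(counts))            # the result is a pandas Series
print(counts.index.tolist())   # unique values, most frequent first
print(counts.tolist())         # their counts, in the same order

# dropna=False also counts the missing entry
counts_with_na = s.value_counts(dropna=False)
print(counts_with_na.sum())    # total now equals the length of the column
```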

Let’s look at some examples of using the value_counts() function to get the count of occurrences of values in a column.

First, we will create a sample dataframe that we will be using throughout this tutorial.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Olympics': [2008, 2008, 2012, 2012, 2012, 2016, 2016, 2016],
    'Event': ['100 m', '200 m', '100 m', '200 m', '4x100 m', '100 m', '200 m', '4x100 m'],
    'Medal': ['Gold', 'Gold', 'Gold', 'Gold', 'Gold', 'Gold', 'Gold', 'Gold']
})
# display the dataframe
print(df)

Output:

   Olympics    Event Medal
0      2008    100 m  Gold
1      2008    200 m  Gold
2      2012    100 m  Gold
3      2012    200 m  Gold
4      2012  4x100 m  Gold
5      2016    100 m  Gold
6      2016    200 m  Gold
7      2016  4x100 m  Gold

We have created a dataframe storing information on the Olympic performances of the legendary sprinter Usain Bolt.

Apply the pandas value_counts() function to the desired column. For example, let’s get the value counts in the “Event” column of the dataframe df. This shows the different events in which Usain Bolt won an Olympic medal, along with their counts.

# count of events
print(df['Event'].value_counts())

Output:

100 m      3
200 m      3
4x100 m    2
Name: Event, dtype: int64

We get a pandas series with each unique value in the “Event” column and its respective count. You can see that Usain Bolt won three medals each in the “100 m” and “200 m” events and two medals in the “4x100 m” event at the Olympics. Note that all of these are gold medals.

At times you may want to know the proportion of each value in the column, for example, what proportion of Usain Bolt’s Olympic medals came from the “100 m” event. To get this, pass normalize=True to the value_counts() function.

# proportion of events
print(df['Event'].value_counts(normalize=True))

Output:

100 m      0.375
200 m      0.375
4x100 m    0.250
Name: Event, dtype: float64

We now get the counts normalized as proportions.
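To make the normalization concrete, the sketch below rebuilds the “Event” column as a standalone series and checks that normalize=True gives the same result as dividing the raw counts by the number of rows. This equivalence assumes the column has no missing values, as is the case here. The variable names and the conversion to percentages are additions for illustration.

```python
import pandas as pd

# the "Event" column from the tutorial's dataframe, rebuilt as a series
events = pd.Series(
    ['100 m', '200 m', '100 m', '200 m', '4x100 m', '100 m', '200 m', '4x100 m'],
    name='Event',
)

# proportions computed by pandas
proportions = events.value_counts(normalize=True)

# the same proportions computed by hand: count / total number of rows
# (matches only because the column has no missing values)
manual = events.value_counts() / len(events)
print((proportions - manual).abs().max())   # largest difference between the two

# express the proportions as percentages, rounded to one decimal place
percentages = (proportions * 100).round(1)
print(percentages)
```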

The return value of the pandas value_counts() function is a pandas series, so you can access individual counts by label. For example, to count only the occurrences of “200 m” in the “Event” column:

# count of a specific value in column
print(df['Event'].value_counts()['200 m'])

Output:

3

Here, we get the count of medals won in the “200 m” category by Usain Bolt as 3.
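One caveat with the lookup above: indexing the counts with a value that never occurs in the column raises a KeyError. The following sketch shows two ways to get 0 instead, using the series get() method with a default, or summing a boolean comparison. The value “400 m” is a made-up example of an absent value, and the variable names are assumptions for illustration.

```python
import pandas as pd

# the "Event" column from the tutorial's dataframe, rebuilt as a series
events = pd.Series(
    ['100 m', '200 m', '100 m', '200 m', '4x100 m', '100 m', '200 m', '4x100 m'],
    name='Event',
)
counts = events.value_counts()

# option 1: Series.get returns the default when the label is missing
present = counts.get('200 m', 0)
absent = counts.get('400 m', 0)   # '400 m' is a hypothetical absent value
print(present, absent)

# option 2: compare each row to the value and sum the resulting booleans
present_mask_count = (events == '200 m').sum()
absent_mask_count = (events == '400 m').sum()
print(present_mask_count, absent_mask_count)
```

The second option skips building the full table of counts, which may be convenient when only a single value is of interest.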

For more on the pandas value_counts() function, refer to its documentation.

With this, we come to the end of this tutorial. The code examples and results presented here were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.

