In this tutorial, we will look at how to count the occurrences of a value in a pandas dataframe column with the help of some examples.
Pandas value_counts() function
You can use the pandas series value_counts() function to count occurrences of each value in a pandas column. The following is the syntax:
# count occurrences of each value in a column
df[col].value_counts()
# count occurrences of a specific value in a column
df[col].value_counts()[value]
It returns a pandas series containing the counts of unique values.
Let’s look at some examples of using the value_counts() function to get the count of occurrences of values in a column.
First, we will create a sample dataframe that we will be using throughout this tutorial.
import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Olympics': [2008, 2008, 2012, 2012, 2012, 2016, 2016, 2016],
    'Event': ['100 m', '200 m', '100 m', '200 m', '4x100 m', '100 m', '200 m', '4x100 m'],
    'Medal': ['Gold', 'Gold', 'Gold', 'Gold', 'Gold', 'Gold', 'Gold', 'Gold']
})

# display the dataframe
print(df)
Output:
   Olympics    Event Medal
0      2008    100 m  Gold
1      2008    200 m  Gold
2      2012    100 m  Gold
3      2012    200 m  Gold
4      2012  4x100 m  Gold
5      2016    100 m  Gold
6      2016    200 m  Gold
7      2016  4x100 m  Gold
This dataframe stores the Olympic results of the legendary sprinter Usain Bolt, with one row per medal giving the year of the Games, the event, and the medal won.
Count occurrences of each unique value in the column
Apply the pandas value_counts() function on the desired column. For example, let’s get the value counts in the “Event” column of the dataframe df. This will show the different events in which Usain Bolt won an Olympic medal, along with the number of medals in each.
# count of events
print(df['Event'].value_counts())
Output:
100 m      3
200 m      3
4x100 m    2
Name: Event, dtype: int64
We get a pandas series with each unique value in the “Event” column and its respective count. You can see that Usain Bolt won three medals each in the “100 m” and “200 m” events and two medals in the “4x100 m” event at the Olympics. Note that all of these are gold medals.
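Because the result is an ordinary pandas series, its index holds the unique values and its values hold the counts. The following is a minimal sketch that rebuilds only the “Event” column of the sample dataframe and converts the counts to a plain dictionary. The variable names event_counts and counts_dict are our own, chosen for illustration.

```python
import pandas as pd

# rebuild only the "Event" column of the sample dataframe
df = pd.DataFrame({
    'Event': ['100 m', '200 m', '100 m', '200 m', '4x100 m',
              '100 m', '200 m', '4x100 m']
})

# value_counts() returns a series: index = unique values, values = counts
event_counts = df['Event'].value_counts()

# convert to a plain dictionary for lookups outside pandas
counts_dict = event_counts.to_dict()
print(counts_dict)

# the counts add up to the number of rows, since this column has no missing values
print(event_counts.sum() == len(df))
```

Note that value_counts() skips missing values by default, so in a column that contains NaN entries the counts would add up to fewer than the number of rows.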
Count occurrences of values in terms of proportion
At times you may want to know the proportion of each value in the column. For example, what proportion of Usain Bolt’s medals at the Olympics came from the “100 m” event? To get proportions instead of counts, pass normalize=True to the value_counts() function.
# proportion of events
print(df['Event'].value_counts(normalize=True))
Output:
100 m      0.375
200 m      0.375
4x100 m    0.250
Name: Event, dtype: float64
We now get the counts normalized as proportions.
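As a quick check, each proportion is simply the count divided by the total number of non-missing values, so the proportions sum to 1. Here is a minimal sketch, assuming the same “Event” column as in the sample dataframe. The names counts, proportions, manual, and percentages are illustrative only.

```python
import pandas as pd

# rebuild only the "Event" column of the sample dataframe
df = pd.DataFrame({
    'Event': ['100 m', '200 m', '100 m', '200 m', '4x100 m',
              '100 m', '200 m', '4x100 m']
})

counts = df['Event'].value_counts()
proportions = df['Event'].value_counts(normalize=True)

# each proportion equals count / total number of non-missing values
manual = counts / counts.sum()
print((manual - proportions).abs().max() < 1e-12)

# express the proportions as percentages
percentages = (proportions * 100).round(1)
print(percentages['100 m'])  # 37.5
```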
Count occurrences of a specific value in a column
The return value from the pandas value_counts() function is a pandas series from which you can access individual counts by label. For example, to count just the occurrences of “200 m” in the “Event” column:
# count of a specific value in column
print(df['Event'].value_counts()['200 m'])
Output:
3
Here, we get the count of medals won in the “200 m” category by Usain Bolt as 3.
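One caveat: indexing the result with a value that never appears in the column raises a KeyError. The following is a minimal sketch of two alternatives that return 0 instead, one using the series get() method with a default and the other summing a boolean comparison over the column. The value '400 m' is a made-up example that does not occur in the data.

```python
import pandas as pd

# rebuild only the "Event" column of the sample dataframe
df = pd.DataFrame({
    'Event': ['100 m', '200 m', '100 m', '200 m', '4x100 m',
              '100 m', '200 m', '4x100 m']
})

counts = df['Event'].value_counts()

# get() returns the given default instead of raising KeyError
print(counts.get('200 m', 0))  # 3
print(counts.get('400 m', 0))  # 0 ('400 m' is not in the column)

# alternative: compare the column to the value and sum the True entries
print((df['Event'] == '200 m').sum())  # 3
```

The boolean comparison avoids building the full table of counts, which can be convenient when you only need the count of a single value.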
For more on the pandas value_counts() function, refer to its documentation.
With this, we come to the end of this tutorial. The code examples and results presented here were run in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.