Pandas groupby is a great way to group values of a dataframe on one or more column values. When performing such operations, you might need to know the number of rows in each group. In this tutorial, we will look at how to count the number of rows in each group of a pandas groupby object.

## How to count rows in each group of pandas groupby?

You can use the pandas groupby `size()`

function to count the number of rows in each group of a groupby object. The following is the syntax:

df.groupby('Col1').size()

It returns a pandas series with the count of rows for each group.

It determines the number of rows by determining the size of each group (similar to how to get the size of a dataframe, e.g. *len(df)*) hence is not affected by NaN values in the dataset. That is, it gives a count of all rows for each group whether they are NaN or not. (See the examples below)

## Example

Let’s look at some examples of counting the number of rows in each group of a pandas groupby object. First, we will create a sample dataframe that we will be using throughout this tutorial.

import pandas as pd # create a dataframe df = pd.DataFrame({ 'Team' : ['A', 'A', 'B', 'B', 'B', 'C'], 'Points': [15, 18, 11, 17, 10, 13] }) # display the dataframe df

Output:

We now have a dataframe of the top scorers in a debating competition from different teams.

### Using pandas groupby `size()`

Let’s group the above dataframe on the column “Team” and get the number of rows in each group using the groupby `size()`

function.

# size of each group print(df.groupby('Team').size())

Output:

Team A 2 B 3 C 1 dtype: int64

You can see that we get the count of rows for each group. There are two rows for team A, three rows for team B, and one row for team C in the dataframe df.

### Using pandas groupby `count()`

You can also use the pandas groupby `count()`

function which gives the “count” of values in each column for each group. For example, let’s group the dataframe df on the “Team” column and apply the `count()`

function.

# count in each group print(df.groupby('Team').count())

Output:

Points Team A 2 B 3 C 1

We get a dataframe of counts of values for each group and each column. Note that counts are similar to the row sizes we got above. This is because there are no NaN values present in the dataframe.

Alternatively, you can use the pandas `value_counts()`

function if you’re grouping by a single column and want the counts.

# using value_counts() print(df['Team'].value_counts())

Output:

B 3 A 2 C 1 Name: Team, dtype: int64

### Difference between `size()`

and `count()`

Note that the `count()`

function is fundamentally different than the `size()`

function as it gives you the “count of values in each group” which may not necessarily be equal to the number of rows. This is because **the count() function will not count any NaN values it encounters.** Let’s see the difference between the two through an example.

First, let’s create a new dataframe so that it has some NaN values.

import numpy as np # create a dataframe df = pd.DataFrame({ 'Team' : ['A', 'A', 'B', 'B', 'B', 'C'], 'Points': [15, 18, 11, 17, np.nan, 13] }) # display the dataframe df

Output:

Now, let’s group the dataframe on “Team” and apply the `size()`

function.

# size of each group print(df.groupby('Team').size())

Output:

Team A 2 B 3 C 1 dtype: int64

Let’s group the dataframe on “Team” and apply the `count()`

function.

# count in each group print(df.groupby('Team').count())

Output:

Points Team A 2 B 2 C 1

You can see that with the count() function we only got the count of non-null values for each group whereas with the `size()`

function we got the actual number of rows for each group.

Thus, if you aim to get the number of rows in each group, use the `size()`

function on the groupby object.

For more on the pandas groupby size() function, refer to its documentation.

## Summary – Count Rows in Each Group

In this tutorial, we looked at how we can get the count of rows in each group of a groupby object in Pandas. The following are the key takeaways –

- Use the pandas groupby
`size()`

function to count rows for each group. It counts the rows irrespective of their values. - In certain use-cases, the pandas groupby
`count()`

can also be used but it only counts the non-Nan values in each group.

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a python (version 3.8.3) kernel having numpy version 1.18.5 and pandas version 1.0.5

**Subscribe to our newsletter for more informative guides and tutorials. ****We do not spam and you can opt out any time.**

Tutorials on common row operations in pandas –