average for each row in a pandas dataframe

Average for each row in Pandas Dataframe

In this tutorial, we will look at how to calculate the average for each row in a pandas dataframe with the help of some examples.

To get the average for each row in a pandas dataframe, use the pandas dataframe mean() function with axis=1. The following is the syntax:

# get mean for each row
df.mean(axis=1)

It returns the mean of each row as a pandas series. Note that the pandas mean() function uses axis=0 by default, which calculates the mean of each column rather than each row. Thus, make sure to pass axis=1 if you want to get the average for each row.
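To make the difference between the two axes concrete, here is a minimal sketch on a small all-numeric dataframe. The dataframe df and its columns a and b are made up for illustration and are not part of the tutorial's data.

```python
import pandas as pd

# hypothetical all-numeric dataframe, used only to contrast the two axes
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# default (axis=0): one mean per column, indexed by column name
col_means = df.mean()

# axis=1: one mean per row, indexed by the row labels
row_means = df.mean(axis=1)

print(col_means)  # a -> 2.0, b -> 5.0
print(row_means)  # 0 -> 2.5, 1 -> 3.5, 2 -> 4.5
```

The column-wise result has one entry per column, while the row-wise result has one entry per row and keeps the dataframe's index.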

Let’s look at some examples of using the above syntax. First, we will create a dataframe that we will be using throughout this tutorial.

import pandas as pd

# create a pandas dataframe
scores_df = pd.DataFrame({
    'Name': ['Sam', 'Soniya', 'Neeraj'],
    'Maths': [49, 81, 83],
    'History': [88, 70, 76],
    'Science': [61, 76, 90]
})
# display the dataframe
print(scores_df)

Output:

     Name  Maths  History  Science
0     Sam     49       88       61
1  Soniya     81       70       76
2  Neeraj     83       76       90

We created a dataframe with three rows, each storing the scores of a student in the subjects – Maths, History, and Science.

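To get the mean for each row in the dataframe, apply the pandas dataframe mean() function with axis=1. For example, let’s find the average score for each of the students in the dataframe scores_df. Since the dataframe also contains the non-numeric Name column, we pass numeric_only=True so that only the score columns are averaged. Older pandas versions (such as the 1.0.5 used here) skip non-numeric columns silently, but pandas 2.0 and later raise a TypeError without it.

# get mean for each row (numeric columns only)
print(scores_df.mean(axis=1, numeric_only=True))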

Output:

0    66.000000
1    75.666667
2    83.000000
dtype: float64

We get the mean for each row as a pandas series.
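The returned series shares its index with the dataframe, so it can be relabeled or rounded like any other series. The sketch below is one possible way to do this, not the only one. It selects the subject columns explicitly (the list name subject_cols is an assumption introduced here) and labels each average with the student's name.

```python
import pandas as pd

scores_df = pd.DataFrame({
    'Name': ['Sam', 'Soniya', 'Neeraj'],
    'Maths': [49, 81, 83],
    'History': [88, 70, 76],
    'Science': [61, 76, 90]
})

# hypothetical helper list: the columns that should go into the average
subject_cols = ['Maths', 'History', 'Science']

# row-wise mean over the selected columns only
row_means = scores_df[subject_cols].mean(axis=1)

# label each average with the student's name instead of 0, 1, 2
row_means.index = scores_df['Name']

# round for display
print(row_means.round(2))
```

Selecting the columns explicitly also makes it obvious which values contribute to the average, regardless of what other columns the dataframe contains.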

Let’s add a new column to the scores_df dataframe representing the mean scores for each student.

# add new column with average score of each student
# numeric_only=True skips the non-numeric Name column (needed on pandas 2.0+)
scores_df['Average Score'] = scores_df.mean(axis=1, numeric_only=True)
# display the dataframe
print(scores_df)

Output:

     Name  Maths  History  Science  Average Score
0     Sam     49       88       61      66.000000
1  Soniya     81       70       76      75.666667
2  Neeraj     83       76       90      83.000000

By default, the pandas mean() function skips NA values when computing the average. To demonstrate this, let’s create a scores dataframe with some missing values.

import numpy as np

# dataframe with some missing values
scores_df = pd.DataFrame({
    'Name': ['Sam', 'Soniya', 'Neeraj'],
    'Maths': [49, np.nan, 83],
    'History': [np.nan, 70, 76],
    'Science': [61, np.nan, 90]
})
# display the dataframe
print(scores_df)

Output:

     Name  Maths  History  Science
0     Sam   49.0      NaN     61.0
1  Soniya    NaN     70.0      NaN
2  Neeraj   83.0     76.0     90.0

Now let’s see what the result looks like when getting the average for each row.

# add new column with average score of each student
# numeric_only=True skips the non-numeric Name column (needed on pandas 2.0+)
scores_df['Average Score'] = scores_df.mean(axis=1, numeric_only=True)
# display the dataframe
print(scores_df)

Output:

     Name  Maths  History  Science  Average Score
0     Sam   49.0      NaN     61.0           55.0
1  Soniya    NaN     70.0      NaN           70.0
2  Neeraj   83.0     76.0     90.0           83.0

You can see that the average value for each row doesn’t take the NaN values into account.
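One way to check this behavior is to compute the average by hand: the sum of the non-missing values in each row divided by the count of non-missing values. The sketch below rebuilds only the score columns (the variable names scores, manual, and builtin are assumptions introduced for this check) and compares the manual result with mean().

```python
import numpy as np
import pandas as pd

# score columns only, with the same missing values as above
scores = pd.DataFrame({
    'Maths': [49, np.nan, 83],
    'History': [np.nan, 70, 76],
    'Science': [61, np.nan, 90]
})

# sum() skips NaN by default and count() counts only non-missing values
manual = scores.sum(axis=1) / scores.count(axis=1)

# built-in row mean with the default skipna=True
builtin = scores.mean(axis=1)

# compare the two results element by element
print(np.allclose(manual, builtin))
```

For the first row, for example, the manual calculation is (49 + 61) / 2 = 55.0, which matches the value in the Average Score column.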

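If you want the NaN values to propagate instead, so that a row with any missing score gets a NaN average, pass skipna=False to the pandas mean() function.

# add new column with average score of each student
# average only the subject columns so the existing Average Score column is not included
subject_cols = ['Maths', 'History', 'Science']
scores_df['Average Score'] = scores_df[subject_cols].mean(axis=1, skipna=False)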
# display the dataframe
print(scores_df)

Output:

     Name  Maths  History  Science  Average Score
0     Sam   49.0      NaN     61.0            NaN
1  Soniya    NaN     70.0      NaN            NaN
2  Neeraj   83.0     76.0     90.0           83.0

We get a NaN in the average if any of the values in the row is NaN.

For more on the pandas mean() function, refer to its documentation.

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a Python (version 3.8.3) kernel having numpy version 1.18.5 and pandas version 1.0.5.




Author

  • Piyush Raj

    Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.
