Pandas – Get max value in one or more columns

Pandas dataframes are great for analyzing and manipulating data. In this tutorial, we will look at how to get the max value in one or more columns of a pandas dataframe with the help of some examples.

Max value in each pandas column

You can use the pandas max() function to get the maximum value in a given column, multiple columns, or the entire dataframe. The following is the syntax:

# df is a pandas dataframe

# max value in a column
df['Col'].max()
# max value for multiple columns
df[['Col1', 'Col2']].max()
# max value for each numerical column in the dataframe
df.max(numeric_only=True)
# max value in the entire dataframe
df.max(numeric_only=True).max()

Called on a single column (a pandas series), max() returns one value. Called on a dataframe, it returns a series holding the maximum of each column by default, or of each row if you pass axis=1.

Let’s look at some use cases of the pandas max() function. First, we’ll create a sample dataframe that we will be using throughout this tutorial.

import numpy as np
import pandas as pd

# create a pandas dataframe
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely', 'Julian Weber', 'Arshad Nadeem'],
    'Country': ['India', 'Czech Republic', 'Czech Republic', 'Germany', 'Pakistan'],
    'Attempt1': [87.03, 83.98, 79.79, 85.30, 82.40],
    'Attempt2': [87.58, np.nan, 80.30, 77.90, np.nan],
    'Attempt3': [76.79, np.nan, 85.44, 78.00, 84.62],
    'Attempt4': [np.nan, 82.86, np.nan, 83.10, 82.91],
    'Attempt5': [np.nan, 86.67, 84.98, 85.15, 81.98],
    'Attempt6': [84.24, np.nan, np.nan, 75.72, np.nan]
})
# display the dataframe
df

Output:

Dataframe showing data on the men's javelin final at the Tokyo 2020 Olympics

Here we created a dataframe containing the results of the top five performers in the men’s javelin throw event final at the Tokyo 2020 Olympics. Each “Attempt” column holds the distance of that throw in meters.

To get the maximum value in a pandas column, use the max() function as follows. For example, let’s get the maximum value achieved in the first attempt.

# max value in Attempt1
print(df['Attempt1'].max())

Output:

87.03

We get 87.03 meters as the maximum distance thrown in the “Attempt1” column.

Note that you can get the index corresponding to the max value with the pandas idxmax() function. Let’s get the name of the athlete who threw the longest in the first attempt with this index.

# index corresponding to the max value
i = df['Attempt1'].idxmax()
print(i)
# display the name corresponding to this index
print(df['Name'][i])

Output:

0
Neeraj Chopra

You can see that the max value corresponds to “Neeraj Chopra”.
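As a side note, label-based lookup with .loc is a common alternative to the chained df['Name'][i] indexing used above. The following is a minimal sketch on a trimmed-down copy of the sample data (only three athletes and the first attempt, chosen here for brevity); it looks up the name and then the full row for the label returned by idxmax().

```python
import pandas as pd

# trimmed-down stand-in for the tutorial's dataframe (illustrative subset)
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely'],
    'Attempt1': [87.03, 83.98, 79.79],
})

# row label of the longest first attempt
i = df['Attempt1'].idxmax()

# look up a single cell by row label and column name
best_name = df.loc[i, 'Name']
print(best_name)

# or pull the entire row for that athlete
best_row = df.loc[i]
print(best_row['Attempt1'])
```

Since idxmax() returns a row label rather than a position, .loc is the matching accessor here.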

You can also get the max value of multiple pandas columns with the pandas max() function. For example, let’s find the maximum values in “Attempt1” and “Attempt2” respectively.

# get max values in columns "Attempt1" and "Attempt2"
print(df[['Attempt1', 'Attempt2']].max())

Output:

Attempt1    87.03
Attempt2    87.58
dtype: float64

Here, we created a subset dataframe with the columns we wanted and then applied the max() function. We get the maximum value for each of the two columns.
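Note that “Attempt2” contains missing values (NaN), yet we still get a number back. That is because max() skips missing values by default. The sketch below, which uses only the “Attempt2” values from the sample dataframe, contrasts the default behavior with skipna=False.

```python
import numpy as np
import pandas as pd

# the "Attempt2" values from the sample dataframe
attempt2 = pd.Series([87.58, np.nan, 80.30, 77.90, np.nan], name='Attempt2')

# default: missing values are ignored
max_default = attempt2.max()
print(max_default)

# skipna=False: any missing value makes the result NaN
max_strict = attempt2.max(skipna=False)
print(max_strict)
```

Passing skipna=False can be handy when a missing value should flag the whole column as incomplete rather than be silently ignored.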

Similarly, you can get the max value for each column in the dataframe. Apply the max function over the entire dataframe instead of a single column or a selection of columns. For example,

# get max values in each column of the dataframe
print(df.max())

Output:

Name        Vitezslav Vesely
Country             Pakistan
Attempt1               87.03
Attempt2               87.58
Attempt3               85.44
Attempt4                83.1
Attempt5               86.67
Attempt6               84.24
dtype: object

We get the maximum values in each column of the dataframe df. Note that we also get max values for the text columns. These are determined by string comparison in Python, so the “max” name or country is simply the one that sorts last.
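To see why “Pakistan” and “Vitezslav Vesely” come out on top, here is a small sketch of how string comparison works, using the values from the “Country” column. The 'apple' and 'Banana' strings are made up purely to illustrate case sensitivity.

```python
import pandas as pd

countries = ['India', 'Czech Republic', 'Czech Republic', 'Germany', 'Pakistan']

# Python's built-in max() compares strings character by character
largest_builtin = max(countries)
print(largest_builtin)

# a pandas series of strings gives the same answer
largest_pandas = pd.Series(countries).max()
print(largest_pandas)

# comparison is case-sensitive: lowercase letters sort after uppercase ones
mixed_case_max = max(['apple', 'Banana'])
print(mixed_case_max)
```

Because of this, the max of a text column is rarely meaningful on its own, which is one reason to restrict max() to numeric columns.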

If you only want the max values for all the numerical columns in the dataframe, pass numeric_only=True to the max() function.

# get max values of only numerical columns
print(df.max(numeric_only=True))

Output:

Attempt1    87.03
Attempt2    87.58
Attempt3    85.44
Attempt4    83.10
Attempt5    86.67
Attempt6    84.24
dtype: float64

What if you want the single largest value across two columns?
You can do so by applying the pandas max() function twice: once to get the max of each column, and again to get the max of those results. For example, let’s get the maximum value considering both “Attempt1” and “Attempt2”.

# max value over two columns
print(df[['Attempt1', 'Attempt2']].max().max())

Output:

87.58

We get 87.58 as the maximum distance considering the first and the second attempts together.
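If what you need instead is the larger of the two attempts for each athlete, one option is to pass axis=1 so that max() works across columns within each row. The following is a minimal sketch on a copy of the sample data limited to the first two attempts.

```python
import numpy as np
import pandas as pd

# copy of the sample data, limited to the first two attempts
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely', 'Julian Weber', 'Arshad Nadeem'],
    'Attempt1': [87.03, 83.98, 79.79, 85.30, 82.40],
    'Attempt2': [87.58, np.nan, 80.30, 77.90, np.nan],
})

# axis=1 takes the max across the selected columns for every row
row_max = df[['Attempt1', 'Attempt2']].max(axis=1)
print(row_max)
```

The result is a series with one value per row, aligned with the dataframe's index. Rows with a missing second attempt fall back to the first attempt because missing values are skipped.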

You can also get the single biggest value in the entire dataframe. For example, let’s get the biggest value in the dataframe df irrespective of the column.

# max value over the entire dataframe
print(df.max(numeric_only=True).max())

Output:

87.58

Here we apply the pandas max() function twice: first to get the max value of each numeric column, and then to get the max value among those.
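As an alternative sketch, you could also select the numeric columns, convert them to a NumPy array, and take the maximum in one step with np.nanmax(), which ignores missing values. The example below uses a copy of the sample data limited to the first three attempts and checks that both approaches agree.

```python
import numpy as np
import pandas as pd

# copy of the sample data, limited to the first three attempts
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely', 'Julian Weber', 'Arshad Nadeem'],
    'Attempt1': [87.03, 83.98, 79.79, 85.30, 82.40],
    'Attempt2': [87.58, np.nan, 80.30, 77.90, np.nan],
    'Attempt3': [76.79, np.nan, 85.44, 78.00, 84.62],
})

# keep only the numeric columns, then take the NaN-aware max of all values
numeric = df.select_dtypes(include='number')
overall_max = np.nanmax(numeric.to_numpy())
print(overall_max)

# the chained pandas approach gives the same result
chained_max = df.max(numeric_only=True).max()
print(chained_max)
```

Both give the same answer here; which one reads better is largely a matter of taste.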

For more on the pandas max() function, refer to its documentation.

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.

