Pandas dataframes are great for analyzing and manipulating data. In this tutorial, we will look at how to get the max value in one or more columns of a pandas dataframe with the help of some examples.
Pandas max() function
You can use the pandas max() function to get the maximum value in a given column, multiple columns, or the entire dataframe. The following is the syntax:
# df is a pandas dataframe

# max value in a column
df['Col'].max()

# max value for multiple columns
df[['Col1', 'Col2']].max()

# max value for each numerical column in the dataframe
df.max(numeric_only=True)

# max value in the entire dataframe
df.max(numeric_only=True).max()
It returns the maximum value or values depending on the input and the axis (see the examples below).
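To make the role of the axis concrete, here is a minimal sketch. The dataframe `scores` and its columns are hypothetical and used only for illustration; `axis=0` (the default) gives one maximum per column, while `axis=1` gives one maximum per row.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
scores = pd.DataFrame({
    'Col1': [1.0, 5.0, 3.0],
    'Col2': [4.0, 2.0, 6.0],
})

# axis=0 (the default): one maximum per column
col_max = scores.max(axis=0)
print(col_max)

# axis=1: one maximum per row
row_max = scores.max(axis=1)
print(row_max)
```

In both cases the result is a pandas Series: indexed by column name for `axis=0`, and by row label for `axis=1`.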
Examples
Let’s look at some use cases of the pandas max() function. First, we’ll create a sample dataframe that we will be using throughout this tutorial.
import numpy as np
import pandas as pd

# create a pandas dataframe
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely', 'Julian Weber', 'Arshad Nadeem'],
    'Country': ['India', 'Czech Republic', 'Czech Republic', 'Germany', 'Pakistan'],
    'Attempt1': [87.03, 83.98, 79.79, 85.30, 82.40],
    'Attempt2': [87.58, np.nan, 80.30, 77.90, np.nan],
    'Attempt3': [76.79, np.nan, 85.44, 78.00, 84.62],
    'Attempt4': [np.nan, 82.86, np.nan, 83.10, 82.91],
    'Attempt5': [np.nan, 86.67, 84.98, 85.15, 81.98],
    'Attempt6': [84.24, np.nan, np.nan, 75.72, np.nan]
})

# display the dataframe
df
Output:
Here we created a dataframe containing the scores of the top five performers in the men’s javelin throw event final at the Tokyo 2020 Olympics. The attempts represent the throw of the javelin in meters.
1. Max value in a single pandas column
To get the maximum value in a pandas column, use the max() function as follows. For example, let’s get the maximum value achieved in the first attempt.
# max value in Attempt1
print(df['Attempt1'].max())
Output:
87.03
We get 87.03 meters as the maximum distance thrown in the “Attempt1” column.
Note that you can get the index corresponding to the max value with the pandas idxmax() function. Let’s get the name of the athlete who threw the longest in the first attempt with this index.
# index corresponding to the max value
i = df['Attempt1'].idxmax()
print(i)

# display the name corresponding to this index
print(df.loc[i, 'Name'])
Output:
0
Neeraj Chopra
You can see that the max value corresponds to “Neeraj Chopra”.
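If you need the whole row rather than a single field, the label returned by idxmax() can be passed to `.loc`. The sketch below rebuilds a two-column version of the sample data so that it runs on its own; the names `best_row` and `top_rows` are illustrative. Note that idxmax() returns only the first label when several rows share the maximum, so a boolean filter is shown as one way to keep all tied rows.

```python
import pandas as pd

# Reduced copy of the tutorial's data so the snippet is self-contained
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely',
             'Julian Weber', 'Arshad Nadeem'],
    'Attempt1': [87.03, 83.98, 79.79, 85.30, 82.40],
})

# Entire row (as a Series) for the best first attempt
best_row = df.loc[df['Attempt1'].idxmax()]
print(best_row)

# All rows that tie for the maximum (here there is only one)
top_rows = df[df['Attempt1'] == df['Attempt1'].max()]
print(top_rows)
```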
2. Max value in two pandas columns
You can also get the max value of multiple pandas columns with the pandas max() function. For example, let’s find the maximum values in “Attempt1” and “Attempt2” respectively.
# get max values in columns "Attempt1" and "Attempt2"
print(df[['Attempt1', 'Attempt2']].max())
Output:
Attempt1    87.03
Attempt2    87.58
dtype: float64
Here, we created a subset dataframe with the columns we wanted and then applied the max() function. We get the maximum value for each of the two columns.
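The “Attempt2” column contains missing values (NaN), yet max() still returns a number. That is because max() skips NaN by default (`skipna=True`). The following sketch, which uses a standalone copy of the “Attempt2” values, shows the difference when `skipna=False` is passed instead.

```python
import numpy as np
import pandas as pd

# Standalone copy of the "Attempt2" values from the tutorial
attempt2 = pd.Series([87.58, np.nan, 80.30, 77.90, np.nan], name='Attempt2')

# Default behavior: NaN values are ignored
default_max = attempt2.max()
print(default_max)  # 87.58

# With skipna=False, any NaN in the data makes the result NaN
strict_max = attempt2.max(skipna=False)
print(strict_max)  # nan
```

Passing `skipna=False` can be useful when a missing value should be treated as a signal that the result is incomplete.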
3. Max value for each column in the dataframe
Similarly, you can get the max value for each column in the dataframe. Apply the max function over the entire dataframe instead of a single column or a selection of columns. For example,
# get max values in each column of the dataframe
print(df.max())
Output:
Name        Vitezslav Vesely
Country             Pakistan
Attempt1               87.03
Attempt2               87.58
Attempt3               85.44
Attempt4                83.1
Attempt5               86.67
Attempt6               84.24
dtype: object
We get the maximum values in each column of the dataframe df. Note that we also get max values for the text columns, which are determined by string comparison in Python.
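As a rough illustration of what string comparison means here, the sketch below applies max() to the athlete names. Python compares strings character by character using their Unicode code points, so “Vitezslav Vesely” wins because “V” sorts after “N”, “J”, and “A”. The second list (`mixed_case`) is a hypothetical example showing that all uppercase ASCII letters sort before lowercase ones.

```python
import pandas as pd

names = ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely',
         'Julian Weber', 'Arshad Nadeem']

# Built-in max() and Series.max() agree on the "largest" string
builtin_max = max(names)
series_max = pd.Series(names).max()
print(builtin_max)  # Vitezslav Vesely

# Hypothetical example: lowercase letters compare greater than uppercase
mixed_case = ['Zebra', 'apple']
mixed_max = max(mixed_case)
print(mixed_max)  # apple
```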
If you only want the max values for all the numerical columns in the dataframe, pass numeric_only=True to the max() function.
# get max values of only numerical columns
print(df.max(numeric_only=True))
Output:
Attempt1    87.03
Attempt2    87.58
Attempt3    85.44
Attempt4    83.10
Attempt5    86.67
Attempt6    84.24
dtype: float64
4. Max value between two pandas columns
What if you want to get the maximum value between two columns?
You can do so by using the pandas max() function twice. For example, let’s get the maximum value considering both “Attempt1” and “Attempt2”.
# max value over two columns
print(df[['Attempt1', 'Attempt2']].max().max())
Output:
87.58
We get 87.58 as the maximum distance considering the first and the second attempts together.
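A related but different question is the larger of the two values for each row, that is, each athlete's better throw across the two attempts. One way to sketch this is with `axis=1`; the snippet rebuilds the relevant columns so that it runs on its own, and the name `best_of_two` is illustrative.

```python
import numpy as np
import pandas as pd

# Relevant columns from the tutorial's dataframe
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely',
             'Julian Weber', 'Arshad Nadeem'],
    'Attempt1': [87.03, 83.98, 79.79, 85.30, 82.40],
    'Attempt2': [87.58, np.nan, 80.30, 77.90, np.nan],
})

# Row-wise maximum over the two columns; NaN values are skipped by default
best_of_two = df[['Attempt1', 'Attempt2']].max(axis=1)
print(best_of_two)
```

The result has one value per row, so it can be assigned back to the dataframe as a new column if needed.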
5. Max value in the entire dataframe
You can also get the single biggest value in the entire dataframe. For example, let’s get the biggest value in the dataframe df irrespective of the column.
# max value over the entire dataframe
print(df.max(numeric_only=True).max())
Output:
87.58
Here we apply the pandas max() function twice: first to get the max value of each numeric column, and then to get the max value among those.
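If you also want to know where the overall maximum sits, one possible approach is to combine max() with idxmax(): first find the column with the largest maximum, then find the row label of the maximum within that column. This is a sketch under the assumption that the numeric columns can be selected with select_dtypes(); the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the tutorial's dataframe
df = pd.DataFrame({
    'Name': ['Neeraj Chopra', 'Jakub Vadlejch', 'Vitezslav Vesely',
             'Julian Weber', 'Arshad Nadeem'],
    'Country': ['India', 'Czech Republic', 'Czech Republic', 'Germany', 'Pakistan'],
    'Attempt1': [87.03, 83.98, 79.79, 85.30, 82.40],
    'Attempt2': [87.58, np.nan, 80.30, 77.90, np.nan],
    'Attempt3': [76.79, np.nan, 85.44, 78.00, 84.62],
    'Attempt4': [np.nan, 82.86, np.nan, 83.10, 82.91],
    'Attempt5': [np.nan, 86.67, 84.98, 85.15, 81.98],
    'Attempt6': [84.24, np.nan, np.nan, 75.72, np.nan],
})

# Keep only the numeric columns
numeric = df.select_dtypes(include='number')

# Column whose maximum is the largest, then the row holding that maximum
best_col = numeric.max().idxmax()
best_row = numeric[best_col].idxmax()
overall_max = numeric.loc[best_row, best_col]

print(best_col, best_row, overall_max)
print(df.loc[best_row, 'Name'])
```

For this data, the overall maximum of 87.58 is found in “Attempt2” and belongs to Neeraj Chopra.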
For more on the pandas max() function, refer to its documentation.
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.