In this tutorial, we’ll look at how to filter pandas dataframe rows based on a list of values for a column.
How to filter a pandas dataframe on a set of values?
To filter the rows of a dataframe on a set or collection of values, you can use the isin() membership function. This keeps only the rows whose column value appears in the given list. The following is the syntax:
df_filtered = df[df['Col1'].isin(allowed_values)]
Here, allowed_values is the list of values of column Col1 that you want to filter the dataframe for. Any row with its Col1 value not present in the given list is filtered out.
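As a minimal sketch of the syntax above, the snippet below builds a small, made-up dataframe (the column Col2 and all of the values are assumptions for illustration only) and filters it on Col1. It also shows that allowed_values can be a set rather than a list.

```python
import pandas as pd

# hypothetical data, used only to illustrate the syntax
df = pd.DataFrame({
    'Col1': ['a', 'b', 'c', 'a'],
    'Col2': [1, 2, 3, 4],
})

# the values to keep; a set works as well as a list
allowed_values = {'a', 'c'}

# keep only the rows whose Col1 value is in allowed_values
df_filtered = df[df['Col1'].isin(allowed_values)]
print(df_filtered)
```

The row with Col1 equal to 'b' is dropped, and the remaining rows keep their original index labels.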
Example
Let’s look at an example to see the filtering in action. Suppose you have data on the operators and store locations of a fast-food joint, and you only want to see the operators at a few specific locations:
import pandas as pd
# store locations of a fast-food joint
data = {
    'Operator': ['Sam', 'Mike', 'Harvey', 'Susan', 'Jim', 'Kevin', 'Diane'],
    'City': ['New York', 'Seattle', 'New York', 'Los Angeles', 'Scranton',
             'Houston', 'Miami'],
}
# create a dataframe
store_df = pd.DataFrame(data)
print(store_df)
Output:
Operator City
0 Sam New York
1 Mike Seattle
2 Harvey New York
3 Susan Los Angeles
4 Jim Scranton
5 Kevin Houston
6 Diane Miami
Now, if you want to know the names of the operators in New York and Los Angeles only, you can filter the above dataframe using isin:
# filter for New York and Los Angeles
store_df_filtered = store_df[store_df['City'].isin(['New York', 'Los Angeles'])]
print(store_df_filtered)
Output:
Operator City
0 Sam New York
2 Harvey New York
3 Susan Los Angeles
Here, you can see that we passed the list of cities for which we wanted to filter the dataframe to isin. The isin membership function checks whether the value in the column City is present in the passed list or not. This method is quite useful when you want to filter on multiple values in a dataframe column.
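To make the membership check more concrete, the sketch below stores the result of isin in a variable before using it to index the dataframe. The variable names mask and kept are ours, introduced only for illustration; the data is the same as in the example above.

```python
import pandas as pd

store_df = pd.DataFrame({
    'Operator': ['Sam', 'Mike', 'Harvey', 'Susan', 'Jim', 'Kevin', 'Diane'],
    'City': ['New York', 'Seattle', 'New York', 'Los Angeles', 'Scranton',
             'Houston', 'Miami'],
})

# isin() returns a boolean Series with one True/False value per row
mask = store_df['City'].isin(['New York', 'Los Angeles'])
print(mask.tolist())
# [True, False, True, True, False, False, False]

# indexing the dataframe with the mask keeps only the rows marked True
kept = store_df[mask]
print(kept['Operator'].tolist())
# ['Sam', 'Harvey', 'Susan']
```

Splitting the step in two like this can help when you want to inspect or reuse the mask before filtering.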
Also notice that the filtered dataframe retains the index from the original dataframe. If you want to reset the index of the resulting dataframe, you can use the reset_index() function to get a fresh index.
# reset the dataframe index
store_df_filtered = store_df_filtered.reset_index(drop=True)
print(store_df_filtered)
Output:
Operator City
0 Sam New York
1 Harvey New York
2 Susan Los Angeles
Note that we passed drop=True to the reset_index() function. This is done because we do not want the previous index as an additional column in our dataframe.
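For comparison, here is a small sketch of what happens with and without drop=True. The filtered dataframe is rebuilt by hand with the index labels 0, 2, 3 so that the snippet runs on its own; the names filtered, with_old_index and fresh are assumptions for illustration.

```python
import pandas as pd

# the filtered result from above, rebuilt with its original index labels
filtered = pd.DataFrame(
    {'Operator': ['Sam', 'Harvey', 'Susan'],
     'City': ['New York', 'New York', 'Los Angeles']},
    index=[0, 2, 3],
)

# without drop=True, the old index is kept as a new column named 'index'
with_old_index = filtered.reset_index()
print(with_old_index.columns.tolist())
# ['index', 'Operator', 'City']

# with drop=True, the old index is discarded
fresh = filtered.reset_index(drop=True)
print(fresh.columns.tolist())
# ['Operator', 'City']
```

In both cases the resulting dataframe gets a fresh 0, 1, 2 index; the only difference is whether the old labels are kept as a column.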
For more on isin and other Python membership functions, refer to our guide on Python.
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.