In this tutorial, we will look at how to get the names of the columns in a pandas dataframe whose labels contain a specific string, with the help of some examples.
How to find columns whose names contain a specific string?
You can apply the string contains() function with the help of the .str accessor on df.columns to get column names (of a pandas dataframe) that contain a specific string.
You can use the .str accessor to apply string functions to all the column names in a pandas dataframe. Pass the string you want to check for as an argument to the contains() function. The following is the syntax.
# get column names containing a specific string, s
df.columns[df.columns.str.contains(s)]
The idea is to get a boolean array using df.columns.str.contains() and then use it to filter the column names in df.columns.
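One detail worth knowing when using this syntax: by default, contains() interprets its argument as a regular expression and matches case-sensitively. The sketch below uses a small hypothetical dataframe (the column names are made up for illustration and are not part of this tutorial's data) to show how the case and regex parameters of contains() change which names match.

```python
import pandas as pd

# Hypothetical column names, chosen only to illustrate the options
df = pd.DataFrame(columns=["Sales.Q1", "Sales_Q1", "sales total"])

# Default behaviour: the pattern is a regex, so "." matches any character
as_regex = df.columns[df.columns.str.contains("Sales.Q1")]

# regex=False treats the pattern as literal text, so "." must be a real dot
literal = df.columns[df.columns.str.contains("Sales.Q1", regex=False)]

# case=False ignores letter case when matching
case_insensitive = df.columns[df.columns.str.contains("sales", case=False)]

print(list(as_regex))
print(list(literal))
print(list(case_insensitive))
```

If the string you are searching for may contain characters such as ".", "(" or "+", passing regex=False is usually the safer choice.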
Alternatively, you can use a list comprehension to iterate through the column names and check whether each one contains the specified string.
Examples
Let’s now look at some examples of using the above syntax.
First, we will create a pandas dataframe that we will be using throughout this tutorial.
import pandas as pd

# employee data
data = {
    "First Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Last Name": ["Halpert", "Schrute", "Martin", "Flenderson"],
    "Age": [26, 28, 27, 32]
}

# create pandas dataframe
df = pd.DataFrame(data)

# display the dataframe
df
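Output:
  First Name   Last Name  Age
0        Jim     Halpert   26
1     Dwight     Schrute   28
2     Angela      Martin   27
3       Tobi  Flenderson   32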
Here, we created a dataframe with information about some employees in an office. The dataframe has the columns – “First Name”, “Last Name”, and “Age”.
Example 1 – Get column names that contain a specific string
Let’s get the column names in the above dataframe that contain the string “Name” in their column labels.
We’ll apply the string contains() function with the help of the .str accessor to df.columns.
# check if column name contains the string, "Name"
df.columns.str.contains("Name")
Output:
array([ True,  True, False])
You can see that we get a boolean array indicating which columns in the dataframe contain the string “Name”.
We can use the above boolean array to filter df.columns to get only the columns that contain the specified string (in this example, “Name”).
# get column names that contain the string, "Name"
df.columns[df.columns.str.contains("Name")]
Output:
Index(['First Name', 'Last Name'], dtype='object')
We get the column names with “Name” in them.
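If you want the matching columns themselves rather than just their names, the same boolean array can be passed to .loc. The sketch below is an extension of the example above and not part of the original walkthrough; it also shows DataFrame.filter() with its like parameter, which keeps the columns whose labels contain the given substring.

```python
import pandas as pd

# same employee data as in the tutorial
data = {
    "First Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Last Name": ["Halpert", "Schrute", "Martin", "Flenderson"],
    "Age": [26, 28, 27, 32]
}
df = pd.DataFrame(data)

# Option 1: use the boolean array as a column mask with .loc
name_cols = df.loc[:, df.columns.str.contains("Name")]

# Option 2: DataFrame.filter keeps columns whose label contains the substring
name_cols_alt = df.filter(like="Name")

print(name_cols)
```

Both options return a dataframe with only the “First Name” and “Last Name” columns.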
Example 2 – Get column names with a specific string using list comprehension
Alternatively, we can use a list comprehension to iterate through the column names in df.columns and select the ones that contain the given string.
# get column names that contain the string, "Name"
[col for col in df.columns if "Name" in col]
Output:
['First Name', 'Last Name']
We get the column names with “Name” in them. The “First Name” and the “Last Name” columns are the only ones with the string “Name” present in their names in the above dataframe.
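The list comprehension above assumes every column label is a string. If a dataframe has non-string labels (for example, integers), the in check raises a TypeError. The following is a minimal sketch of a more defensive variant that also ignores case; the dataframe here is hypothetical and only serves to illustrate the idea.

```python
import pandas as pd

# Hypothetical dataframe with mixed-type column labels (not the tutorial's data)
df = pd.DataFrame({"First Name": ["Jim"], "last name": ["Halpert"], 2023: [26]})

# Convert each label to a lowercase string before the membership check, so the
# integer label 2023 does not raise a TypeError and letter case is ignored
matches = [col for col in df.columns if "name" in str(col).lower()]

print(matches)
```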
Summary
In this tutorial, we looked at how to get the column names containing a specified string in a pandas dataframe. The following are the key takeaways –
- Use the string contains() function (applied using the .str accessor on df.columns) to check if a column name contains a given string or not (and use this result to filter df.columns).
- You can also get column names containing a specified string with the help of a list comprehension.