In this tutorial, we will look at how to get the column names in a pandas dataframe that start with a specific string, with the help of some examples.
How to find columns whose name starts with a specific string?
You can apply the string startswith() function with the help of the .str accessor on df.columns to check if column names (of a pandas dataframe) start with a specific string.
The .str accessor applies a string function to every column name at once. Pass the string to look for as an argument to the startswith() function. The following is the syntax.
# get column names that start with a specific string, s
df.columns[df.columns.str.startswith(s)]
The idea is to get a boolean array using df.columns.str.startswith() and then use it to filter the column names in df.columns.
Alternatively, you can use a list comprehension to iterate through the column names and check whether each one starts with the specified string.
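As a rough sketch of the two approaches side by side, the snippet below wraps the .str.startswith() filter in a small helper and compares it with a list comprehension. The helper name columns_starting_with and the sample dataframe are made up for illustration and are not part of pandas. The na=False argument is an optional choice here: it makes non-string column labels count as "no match" instead of producing missing values in the mask.

```python
import pandas as pd


def columns_starting_with(df, prefix):
    """Return the column labels of df that start with prefix (hypothetical helper)."""
    # na=False: labels that are not strings are treated as non-matching
    mask = df.columns.str.startswith(prefix, na=False)
    return df.columns[mask]


# made-up dataframe, for illustration only
sample = pd.DataFrame({"x_a": [1], "x_b": [2], "y": [3]})

# approach 1: boolean mask from the .str accessor
via_str = list(columns_starting_with(sample, "x_"))

# approach 2: plain Python list comprehension
via_comprehension = [col for col in sample.columns if col.startswith("x_")]

print(via_str)
print(via_comprehension)
```

Both approaches should give the same names for string column labels. Note that startswith() matches the text literally and is case-sensitive.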
Examples
Let’s now look at some examples of using the above syntax.
First, we will create a pandas dataframe that we will be using throughout this tutorial.
import pandas as pd

# employee data
data = {
    "Emp_Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Emp_Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"]
}

# create pandas dataframe
df = pd.DataFrame(data)

# display the dataframe
df
Output:
Here, we created a dataframe with information about some employees in an office. The dataframe has the columns – “Emp_Name”, “Emp_Age”, and “Department”.
Example 1 – Get column names that start with a specific string
Let’s get the column names in the above dataframe that start with the string “Emp_” in their column labels.
We’ll apply the string startswith() function with the help of the .str accessor to df.columns.
# check if column name starts with the string, "Emp_"
df.columns.str.startswith("Emp_")
Output:
array([ True,  True, False])
You can see that we get a boolean array indicating which columns in the dataframe start with the string “Emp_”.
We can use the above boolean array to filter df.columns to get only the columns that start with the specified string (in this example, “Emp_”).
# get column names that start with the string, "Emp_"
df.columns[df.columns.str.startswith("Emp_")]
Output:
Index(['Emp_Name', 'Emp_Age'], dtype='object')
We get the column names starting with “Emp_” in the above dataframe.
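If you want the matching columns themselves rather than just their names, the same boolean array can be passed to df.loc. The sketch below rebuilds the employee dataframe so that it runs on its own. The names emp_df and emp_df_regex are illustrative, and the df.filter() call with a regular expression is shown only as one possible alternative.

```python
import pandas as pd

# same employee data as in the tutorial
df = pd.DataFrame({
    "Emp_Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Emp_Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
})

# boolean array: True for column names starting with "Emp_"
mask = df.columns.str.startswith("Emp_")

# keep all rows, but only the columns where the mask is True
emp_df = df.loc[:, mask]

# alternative: filter column labels with a regex anchored at the start
emp_df_regex = df.filter(regex="^Emp_")

print(list(emp_df.columns))
```

The regex variant treats the pattern as a regular expression, so a prefix containing special characters (for example a dot) would need escaping there, while startswith() always matches literally.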
Example 2 – Get column names that start with a specific string using list comprehension
Alternatively, we can use a list comprehension to iterate through the column names in df.columns and select the ones that start with the given string.
# get column names that start with the string, "Emp_"
[col for col in df.columns if col.startswith("Emp_")]
Output:
['Emp_Name', 'Emp_Age']
We get the column names that start with “Emp_”, this time as a plain Python list rather than a pandas Index. The “Emp_Name” and the “Emp_Age” columns are the only ones that start with the string “Emp_” in the above dataframe.
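The list comprehension is also easy to adapt. As a small sketch, Python's built-in str.startswith() accepts a tuple of prefixes, and lowercasing each name first gives a case-insensitive check. The prefixes and variable names below are chosen only for illustration.

```python
import pandas as pd

# same employee data as in the tutorial
df = pd.DataFrame({
    "Emp_Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Emp_Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
})

# match any of several prefixes by passing a tuple to str.startswith()
prefixes = ("Emp_A", "Dep")
multi_matches = [col for col in df.columns if col.startswith(prefixes)]

# case-insensitive match: compare lowercased names against a lowercase prefix
ci_matches = [col for col in df.columns if col.lower().startswith("emp_")]

print(multi_matches)
print(ci_matches)
```

This assumes every column label is a string. A non-string label (such as an integer) has no startswith() method, so the comprehension would need an extra isinstance(col, str) check in that case.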
Summary
In this tutorial, we looked at how to get the column names that start with a specified string in a pandas dataframe. The following are the key takeaways –
- Use the string startswith() function (applied using the .str accessor on df.columns) to check if a column name starts with a given string or not (and use this result to filter df.columns).
- You can also get column names that start with a specified string with the help of a list comprehension.