In this tutorial, we will look at how to delete all the columns of a pandas dataframe except one (or some) specific column(s) with the help of some examples.
How to delete all columns except one (or more) specific column(s)?
To delete all the columns except some specific ones from a pandas dataframe, you can filter the dataframe such that it contains only those specific columns.
There are multiple ways to filter a dataframe for specific columns in pandas:
- Use the square bracket notation to select the columns that you want. For example, to retain only the columns “Col1” and “Col2” from the dataframe df, use df[['Col1', 'Col2']].
- Use the pandas dataframe filter() method.
- Use the .loc property to keep only the columns you want.
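As a quick illustration of the three approaches listed above, here is a minimal sketch on a tiny made-up dataframe (the column names and values are placeholders, not part of the Pokemon dataset used later). It also checks that all three approaches give the same result.

```python
import pandas as pd

# A tiny made-up dataframe standing in for a real dataset
df = pd.DataFrame({
    "Col1": [1, 2, 3],
    "Col2": ["a", "b", "c"],
    "Col3": [0.1, 0.2, 0.3],
})

cols_to_keep = ["Col1", "Col2"]

# 1. Square bracket notation with a list of column names
kept_brackets = df[cols_to_keep]

# 2. The filter() method, passing the names via the items parameter
kept_filter = df.filter(items=cols_to_keep)

# 3. The .loc property (":" selects all the rows)
kept_loc = df.loc[:, cols_to_keep]

# All three results contain the same columns and values
print(kept_brackets.equals(kept_filter) and kept_filter.equals(kept_loc))
```

None of these calls modify df itself. Each returns a new dataframe, which is why the examples in this tutorial assign the result back to df.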
Examples
Let’s now look at some examples of using the above methods to remove all columns except some from a pandas dataframe.
For this tutorial, we will be using the Pokemon dataset to create our dataframe and show the usage of the different methods.
import pandas as pd

# load the pokemon dataset
df = pd.read_csv('data/Pokemon.csv')

# display the dataframe head
df.head()
Output:
Here, we loaded the Pokemon dataset as a dataframe using the read_csv() function in pandas. You can see that the dataframe has 13 columns with different information and attributes about each Pokemon.
Example 1 – Remove all columns except some using square bracket notation
Let’s remove all the columns except the “Name” column from the above dataframe using the square bracket notation.
In this method, we use the list of columns we want to select (or retain) inside a pair of square brackets with the dataframe.
# keep only the "Name" column
df = df[["Name"]]

# display the dataframe head
df.head()
Output:
You can see that the resulting dataframe contains only the “Name” column.
You can similarly use this method to remove all the columns except some specific ones. For example, let’s reload the dataset and remove all the columns except “Name” and “Type 1”.
# load the pokemon dataset
df = pd.read_csv('data/Pokemon.csv')

# keep only the "Name" and "Type 1" columns
df = df[["Name", "Type 1"]]

# display the dataframe head
df.head()
Output:
The dataframe now has only the “Name” and “Type 1” columns.
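One detail of the square bracket notation is worth a short sketch: passing a list of names (double brackets) returns a dataframe, while passing a single name returns a pandas Series. The small dataframe below is made up for illustration and only mimics a few columns of the Pokemon dataset.

```python
import pandas as pd

# Made-up rows that mimic a few columns of the Pokemon dataset
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "Type 1": ["Grass", "Fire"],
    "HP": [45, 39],
})

# A list of column names keeps the result a dataframe, even for one column
as_frame = df[["Name"]]

# A single column name (no inner list) returns a Series instead
as_series = df["Name"]

print(type(as_frame).__name__)   # DataFrame
print(type(as_series).__name__)  # Series
```

If the goal is to delete the other columns while still keeping a dataframe, the list form is the one to use.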
Example 2 – Using the filter() method
The pandas dataframe filter() method subsets the rows or columns of a dataframe by their labels (it does not filter on the values stored in the dataframe).
Pass the list of column names you want to retain as an argument to the filter() method (it filters columns by default).
Let’s take the same use case from above. We’ll load the entire pokemon dataset and then remove all the columns except “Name” and “Type 1”.
# load the pokemon dataset
df = pd.read_csv('data/Pokemon.csv')

# keep only the "Name" and "Type 1" columns
df = df.filter(["Name", "Type 1"])

# display the dataframe head
df.head()
Output:
We get the same result as above.
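The result is the same here, but the two approaches can differ when a requested column does not exist. The sketch below uses a made-up dataframe and a hypothetical "Weight" column that is deliberately absent: filter() keeps only the labels it finds, while the square bracket notation raises a KeyError.

```python
import pandas as pd

# Made-up rows that mimic a few columns of the Pokemon dataset
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "Type 1": ["Grass", "Fire"],
    "HP": [45, 39],
})

# "Weight" is a hypothetical column that is not in the dataframe
wanted = ["Name", "Type 1", "Weight"]

# filter() silently skips labels that are not present
filtered = df.filter(items=wanted)
print(list(filtered.columns))

# Square brackets raise a KeyError for the missing label
try:
    df[wanted]
    raised = False
except KeyError:
    raised = True
print(raised)
```

Which behavior is preferable depends on the use case: filter() is convenient when some columns are optional, while the KeyError from square brackets helps catch typos in column names early.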
Example 3 – Retain columns using the .loc property
The loc dataframe property is used to slice (or filter) a dataframe based on its row and column labels.
Let’s take the same use case from the above examples. We’ll load the entire pokemon dataset and then remove all the columns except “Name” and “Type 1”.
# load the pokemon dataset
df = pd.read_csv('data/Pokemon.csv')

# keep only the "Name" and "Type 1" columns
df = df.loc[:, ["Name", "Type 1"]]

# display the dataframe head
df.head()
Output:
We get the same result as above.
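Because .loc works on labels, it also accepts a label slice or a boolean mask in the column position, which can be handy when the columns to keep are adjacent or are selected by a condition. The following is a minimal sketch on a made-up dataframe, not the full Pokemon dataset.

```python
import pandas as pd

# Made-up rows that mimic a few columns of the Pokemon dataset
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "Type 1": ["Grass", "Fire"],
    "Type 2": ["Poison", None],
    "HP": [45, 39],
})

# Label slice: keeps every column from "Name" through "Type 2",
# with both endpoints included
sliced = df.loc[:, "Name":"Type 2"]
print(list(sliced.columns))

# Boolean mask: keeps the columns whose names appear in the given list
masked = df.loc[:, df.columns.isin(["Name", "HP"])]
print(list(masked.columns))
```

Note that a label slice with .loc includes the end label, unlike regular Python slicing.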
Summary
In this tutorial, we looked at some methods to delete all the columns except some (one or more) columns from a pandas dataframe by filtering the dataframe. The following are the key takeaways –
- You can use the square bracket notation to select (or retain) the specific columns you want.
- Alternatively, you can use the pandas filter() method to filter the dataframe for specific columns.
- You can also use the dataframe .loc property to filter a dataframe using column names.