Select One or More Columns in Pandas

There are several ways to select a subset of columns in pandas. You can select them by their names (labels) or by their integer positions. In this tutorial, we’ll look at how to select one or more columns in a pandas dataframe through some examples.

Let’s look at some of the different ways in which we can select columns of a dataframe using their names:

import pandas as pd

# create a sample dataframe
data = {
    'Name': ['Jim', 'Dwight', 'Angela', 'Tobi'],
    'Age': [26, 28, 27, 32],
    'Department': ['Sales', 'Sales', 'Accounting', 'Human Resources']
}

df = pd.DataFrame(data)

# select columns 'Name' and 'Department'
df_selected = df[['Name', 'Department']]

# print the dataframe
print("The original dataframe:\n")
print(df)
print("\nDataframe with the selected columns:\n")
print(df_selected)

Output:

The original dataframe:

     Name  Age       Department
0     Jim   26            Sales
1  Dwight   28            Sales
2  Angela   27       Accounting
3    Tobi   32  Human Resources

Dataframe with the selected columns:

     Name       Department
0     Jim            Sales
1  Dwight            Sales
2  Angela       Accounting
3    Tobi  Human Resources

In the above example, we select the columns Name and Department from the dataframe df by passing them as a list to the indexing operator []. You can see that the returned dataframe just has those two columns.
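The same indexing operator also covers the single-column case. As a minimal sketch on the same sample dataframe (the variable names name_series and name_frame are only illustrative), passing a single label returns a pandas Series, while passing a one-item list returns a dataframe with one column:

```python
import pandas as pd

# same sample dataframe as in the example above
df = pd.DataFrame({
    'Name': ['Jim', 'Dwight', 'Angela', 'Tobi'],
    'Age': [26, 28, 27, 32],
    'Department': ['Sales', 'Sales', 'Accounting', 'Human Resources']
})

# a single label returns a Series
name_series = df['Name']

# a one-item list returns a dataframe with a single column
name_frame = df[['Name']]

print(type(name_series).__name__)  # Series
print(type(name_frame).__name__)   # DataFrame
print(name_frame.shape)            # (4, 1)
```

Use the list form when later code expects a dataframe rather than a Series.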

.loc is a pandas dataframe property used for accessing rows or columns of a dataframe by their labels. You can use it to select a subset of columns of a dataframe by their names.

import pandas as pd

# create a sample dataframe
data = {
    'Name': ['Jim', 'Dwight', 'Angela', 'Tobi'],
    'Age': [26, 28, 27, 32],
    'Department': ['Sales', 'Sales', 'Accounting', 'Human Resources']
}

df = pd.DataFrame(data)

# select columns 'Name' and 'Department'
df_selected = df.loc[:,['Name', 'Department']]

# print the dataframe
print("The original dataframe:\n")
print(df)
print("\nDataframe with the selected columns:\n")
print(df_selected)

Output:

The original dataframe:

     Name  Age       Department
0     Jim   26            Sales
1  Dwight   28            Sales
2  Angela   27       Accounting
3    Tobi   32  Human Resources

Dataframe with the selected columns:

     Name       Department
0     Jim            Sales
1  Dwight            Sales
2  Angela       Accounting
3    Tobi  Human Resources

In the above example, we use df.loc[:,['Name', 'Department']] to select columns Name and Department. Note that the : before the comma selects all the rows for the two columns. To restrict the rows as well, replace it with a row label, a list of labels, or a slice of labels.
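To illustrate restricting rows and columns together, here is a small sketch on the same sample dataframe (the name subset is only illustrative). It assumes the default integer row labels 0 to 3. Keep in mind that slices passed to .loc are label-based and include both endpoints:

```python
import pandas as pd

# same sample dataframe as in the example above
df = pd.DataFrame({
    'Name': ['Jim', 'Dwight', 'Angela', 'Tobi'],
    'Age': [26, 28, 27, 32],
    'Department': ['Sales', 'Sales', 'Accounting', 'Human Resources']
})

# rows labelled 1 through 2 and columns 'Name' through 'Age';
# both ends of each slice are included
subset = df.loc[1:2, 'Name':'Age']

print(subset)
```

Here the result has two rows (Dwight and Angela) and the two columns Name and Age.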

You can also select columns by their integer positions (counted from 0) using the .iloc property of the dataframe.

import pandas as pd

# create a sample dataframe
data = {
    'Name': ['Jim', 'Dwight', 'Angela', 'Tobi'],
    'Age': [26, 28, 27, 32],
    'Department': ['Sales', 'Sales', 'Accounting', 'Human Resources']
}

df = pd.DataFrame(data)

# select columns 'Name' and 'Department' by their positions (0 and 2)
df_selected = df.iloc[:,[0, 2]]

# print the dataframe
print("The original dataframe:\n")
print(df)
print("\nDataframe with the selected columns:\n")
print(df_selected)

Output:

The original dataframe:

     Name  Age       Department
0     Jim   26            Sales
1  Dwight   28            Sales
2  Angela   27       Accounting
3    Tobi   32  Human Resources

Dataframe with the selected columns:

     Name       Department
0     Jim            Sales
1  Dwight            Sales
2  Angela       Accounting
3    Tobi  Human Resources

In the above example, we use the column indexes 0 and 2 to select columns Name and Department respectively from the dataframe df.
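Position-based selection also accepts slices. The sketch below, again on the same sample dataframe with illustrative variable names, shows that a slice passed to .iloc excludes its end position, and that the position of a named column can be looked up with df.columns.get_loc() when you only know its name:

```python
import pandas as pd

# same sample dataframe as in the example above
df = pd.DataFrame({
    'Name': ['Jim', 'Dwight', 'Angela', 'Tobi'],
    'Age': [26, 28, 27, 32],
    'Department': ['Sales', 'Sales', 'Accounting', 'Human Resources']
})

# positions 0 and 1; the end position 2 is not included
first_two = df.iloc[:, 0:2]

# look up the integer position of a column from its name
dept_pos = df.columns.get_loc('Department')

# a single position returns a Series
dept = df.iloc[:, dept_pos]

print(list(first_two.columns))  # ['Name', 'Age']
print(dept_pos)                 # 2
```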

Refer to this guide for more on indexing and selecting data in pandas.

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.

