Split Pandas column of lists into multiple columns

When working with a pandas dataframe, you may find yourself with a column whose values are lists that you’d rather have in separate columns. In this tutorial, we will look at how to split a pandas dataframe column of lists into multiple columns with the help of some examples.

Split pandas column of lists into separate columns

To split a pandas column of lists into multiple columns, convert the column to a list of lists with the tolist() function and create a new dataframe from it. The following is the syntax.

import pandas as pd
# assuming 'Col' is the column you want to split
pd.DataFrame(df['Col'].tolist(), columns=['c1', 'c2', 'c3'])

The columns parameter is optional. Use it to pass the names of the new columns resulting from the split as a list.

Let’s see it in action with the help of an example. First, let’s create a dataframe with a column having a list of values for each row.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Name' : ['a', 'b', 'c'],
    'Values': [[1,2,3], [2,0,1], [3,2,0]]
})

# display the dataframe
df

Output:

Dataframe with a column having values as lists

Now, let’s split the column “Values” into multiple columns, one for each value in the list.

# new df from the column of lists
split_df = pd.DataFrame(df['Values'].tolist())
# display the resulting df
split_df

Output:

The dataframe resulting from the split.

Here, we didn’t pass any column names, hence the column names are given by default. Let’s give specific column names to each of the new columns.

# new df from the column of lists
split_df = pd.DataFrame(df['Values'].tolist(), columns=['v1', 'v2', 'v3'])
# display the resulting df
split_df

Output:

Dataframe resulting from the split with the specified column names.
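If you don't want to type out every column name, the default integer labels can be turned into names with the pandas add_prefix() function. The following is a minimal sketch using the same example data. Note that the resulting names start at 'v0' rather than 'v1', because the default labels start at 0.

```python
import pandas as pd

# same example dataframe as above
df = pd.DataFrame({
    'Name': ['a', 'b', 'c'],
    'Values': [[1, 2, 3], [2, 0, 1], [3, 2, 0]]
})

# the default integer labels 0, 1, 2 become 'v0', 'v1', 'v2'
split_df = pd.DataFrame(df['Values'].tolist()).add_prefix('v')

# display the new column names
print(split_df.columns.tolist())
```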

You may also want to concatenate the dataframe resulting from the split with the original dataframe. For this, use the pandas concat() function with axis=1.

# new df from the column of lists
split_df = pd.DataFrame(df['Values'].tolist(), columns=['v1', 'v2', 'v3'])
# concat df and split_df
df = pd.concat([df, split_df], axis=1)
# display df
df

Output:

Dataframe resulting from the concat operation.
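The concat above works because both dataframes share the default index 0, 1, 2. With axis=1, concat() aligns rows on their index labels, so if the original dataframe has a different index, the split dataframe needs the same one. The following sketch assumes a custom index (the labels 10, 20, 30 are made up for illustration) and passes it along when building the split dataframe.

```python
import pandas as pd

# same data as above, but with a custom (non-default) index
df = pd.DataFrame({
    'Name': ['a', 'b', 'c'],
    'Values': [[1, 2, 3], [2, 0, 1], [3, 2, 0]]
}, index=[10, 20, 30])

# reuse the original index so that concat lines the rows up correctly
split_df = pd.DataFrame(df['Values'].tolist(),
                        columns=['v1', 'v2', 'v3'],
                        index=df.index)

# concat df and split_df column-wise
result = pd.concat([df, split_df], axis=1)
print(result)
```

Without index=df.index, the split dataframe would get the labels 0, 1, 2 and the concatenated result would contain extra rows filled with NaN.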

You may also want to drop the column “Values” now that it has been split into three columns.

# drop Values
df = df.drop('Values', axis=1)
# display df
df

Output:

Dataframe after dropping the column of lists
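If you need these steps often, the split, concat, and drop can be wrapped in a small function. The following is one possible sketch; split_list_column is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

def split_list_column(df, col, new_cols):
    """Hypothetical helper: return a copy of df in which the list column
    `col` is replaced by the columns named in `new_cols`."""
    # new df from the column of lists, keeping the original index
    split_df = pd.DataFrame(df[col].tolist(), columns=new_cols, index=df.index)
    # drop the column of lists and append the new columns
    return pd.concat([df.drop(col, axis=1), split_df], axis=1)

df = pd.DataFrame({
    'Name': ['a', 'b', 'c'],
    'Values': [[1, 2, 3], [2, 0, 1], [3, 2, 0]]
})

result = split_list_column(df, 'Values', ['v1', 'v2', 'v3'])
print(result)
```

The function returns a new dataframe and leaves the original one unchanged.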

What would happen if you use the above method on a column which has lists of variable lengths?

Let’s see for ourselves.

# create a dataframe
df = pd.DataFrame({
    'Name' : ['a', 'b', 'c'],
    'Values': [[1,2,3], [2,0], [3,2,0]]
})

# display the dataframe
df

Output:

Dataframe with the "Values" column containing lists of different lengths.

The column “Values” has lists of different lengths.

# new df from the column of lists
split_df = pd.DataFrame(df['Values'].tolist())
# display the resulting df
split_df

Output:

Dataframe resulting from the split

If the lists in the column are of different lengths, the resulting dataframe has as many columns as the longest list, with NaN in the positions where a shorter list has no value.
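A side effect worth knowing is that a column containing NaN is stored as float, so the integers in it show up as 3.0 and 0.0. The following sketch checks this on the example data and, as one possible way to keep whole numbers, converts the result to the pandas nullable integer type 'Int64'. Whether that conversion suits you depends on your data.

```python
import pandas as pd

# dataframe with lists of different lengths
df = pd.DataFrame({
    'Name': ['a', 'b', 'c'],
    'Values': [[1, 2, 3], [2, 0], [3, 2, 0]]
})

# new df from the column of lists
split_df = pd.DataFrame(df['Values'].tolist())

# the last column holds a NaN, so its dtype is float
print(split_df.dtypes)

# optional: nullable integers keep the missing value without using floats
as_int = split_df.astype('Int64')
print(as_int)
```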


With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a Python (version 3.8.3) kernel having pandas version 1.0.5.

