Read Pickle File as a Pandas DataFrame

Python objects can be saved (serialized) as pickle files for later use, and since pandas dataframes are also Python objects, you can save them as pickle files too. Usually, the data we read into dataframes is stored in CSV, Excel, or text files. In this tutorial, we’ll look at how to read a pickle file as a dataframe in pandas.

You can use the pandas read_pickle() function to read pickled pandas objects (.pkl files) as dataframes in Python. Similar to reading CSV or Excel files in pandas, this function returns the data stored in the file as a pandas dataframe. More precisely, it returns whatever object was pickled, so a pickled dataframe comes back as a dataframe. The following is the syntax:

df = pd.read_pickle('my_data.pkl')

Here, “my_data.pkl” is the pickle file storing the data you want to read.
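To make the return value concrete, here is a minimal sketch showing that read_pickle() hands back the same kind of object that was saved. The sample data, the variable names, and the temporary file names are made up for this illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample objects, used only for this illustration
frame = pd.DataFrame({"Symbol": ["MSFT", "GOOG"], "Shares": [100, 50]})
series = frame["Shares"]

# Write both objects to a temporary folder and read them back
with tempfile.TemporaryDirectory() as tmp_dir:
    frame_path = os.path.join(tmp_dir, "frame.pkl")
    series_path = os.path.join(tmp_dir, "series.pkl")
    frame.to_pickle(frame_path)
    series.to_pickle(series_path)
    loaded_frame = pd.read_pickle(frame_path)
    loaded_series = pd.read_pickle(series_path)

# A pickled DataFrame loads as a DataFrame, a pickled Series as a Series
print(type(loaded_frame).__name__)
print(type(loaded_series).__name__)
```

So if a file was created with to_pickle() on a dataframe, you get a dataframe back, which is the case this tutorial covers.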

Exercise caution when working with pickle files. The pickle module is not secure. Only unpickle data you trust. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. See the Python documentation for the pickle module for details.
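The following harmless sketch illustrates why this warning matters. It uses Python's standard pickle module directly, and the class name is hypothetical. The object's __reduce__ method tells pickle which callable to run when the data is loaded. Here that callable is the built-in len(), but a malicious file could name something harmful instead.

```python
import pickle


class NotWhatItSeems:
    """Hypothetical class used only to show that loading a pickle runs a callable."""

    def __reduce__(self):
        # pickle stores (callable, args) and calls callable(*args) on load
        return (len, ("abc",))


payload = pickle.dumps(NotWhatItSeems())

# Loading does not rebuild the object; it calls len("abc") and returns the result
result = pickle.loads(payload)
print(result)  # 3
```

The loaded value is the result of the function call, not the original object, which shows that unpickling can execute code chosen by whoever created the file.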

Let’s look at an example of using the pandas read_pickle() function. First, we’ll create a sample dataframe that we’ll be saving locally as a pickle file using the pandas to_pickle() function.

import pandas as pd

data = {
    'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',
             'Apple Inc.', 'Netflix, Inc.'],
    'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
    'Industry': ['Tech', 'Tech', 'Automotive', 'Tech', 'Entertainment'],
    'Shares': [100, 50, 150, 200, 80]
}

# create dataframe
df = pd.DataFrame(data)
# print dataframe
print(df)

Output:

                    Name Symbol       Industry  Shares
0  Microsoft Corporation   MSFT           Tech     100
1            Google, LLC   GOOG           Tech      50
2            Tesla, Inc.   TSLA     Automotive     150
3             Apple Inc.   AAPL           Tech     200
4          Netflix, Inc.   NFLX  Entertainment      80

The above is a pandas dataframe representing a sample stock portfolio. Let’s now save this data locally as a pickle file. For this, we’ll use the pandas to_pickle() function.

# save dataframe as a pickle file
df.to_pickle('portfolio.pkl')

Now that we have a dataframe saved as a pickle file with the name portfolio.pkl, we can read it back as a dataframe using the pandas read_pickle() function.

# read pickle file as dataframe
df2 = pd.read_pickle('portfolio.pkl')
# print the dataframe
print(df2)

Output:

                    Name Symbol       Industry  Shares
0  Microsoft Corporation   MSFT           Tech     100
1            Google, LLC   GOOG           Tech      50
2            Tesla, Inc.   TSLA     Automotive     150
3             Apple Inc.   AAPL           Tech     200
4          Netflix, Inc.   NFLX  Entertainment      80

You can see that the pickle file data gets loaded successfully as a pandas dataframe. You can similarly use the pandas read_pickle() function to read dataframes that are serialized as pickle objects. But make sure you’re using pickle files from trusted sources only.
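Rather than comparing the printed output by eye, you can check the round trip in code. The sketch below rebuilds the sample portfolio, saves it to a temporary folder (an assumption made here so that nothing is left on disk), and compares the original and loaded dataframes with the pandas equals() method.

```python
import os
import tempfile

import pandas as pd

# Same sample portfolio as in the tutorial
df = pd.DataFrame({
    'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',
             'Apple Inc.', 'Netflix, Inc.'],
    'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
    'Industry': ['Tech', 'Tech', 'Automotive', 'Tech', 'Entertainment'],
    'Shares': [100, 50, 150, 200, 80]
})

# Save to a temporary folder and read the file back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'portfolio.pkl')
    df.to_pickle(path)
    df2 = pd.read_pickle(path)

# equals() checks that shape, values, and column dtypes all match
is_identical = df.equals(df2)
print(is_identical)
```

Unlike a CSV round trip, pickling keeps the index and the column data types as they were, so no extra parsing options are needed when reading the file back.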

For more on the pandas read_pickle() function, refer to its documentation.
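As one example of the options described there, both to_pickle() and read_pickle() take a compression parameter whose default value, 'infer', picks the compression from the file extension. The sketch below assumes a file name ending in .gz and a small made-up dataframe.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({'Symbol': ['MSFT', 'GOOG', 'TSLA'],
                   'Shares': [100, 50, 150]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'portfolio.pkl.gz')

    # compression='infer' is the default, so the '.gz' suffix selects gzip
    df.to_pickle(path)

    # gzip files start with the two bytes 0x1f 0x8b
    with open(path, 'rb') as f:
        magic = f.read(2)

    # read_pickle() infers gzip from the extension in the same way
    df_back = pd.read_pickle(path)

print(magic == b'\x1f\x8b')
print(df.equals(df_back))
```

You can also pass the compression explicitly, for example compression='gzip', if the file name does not carry a matching extension.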

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.




Author

  • Piyush Raj

    Piyush is a data professional passionate about using data to understand things better and make informed decisions. In the past, he's worked as a Data Scientist for ZS and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.