Python objects can be saved (serialized) as pickle files for later use, and since pandas dataframes are also Python objects, you can save them as pickle files too. Generally, we read data stored in CSV, Excel, or text files as dataframes. In this tutorial, we’ll look at how to read a pickle file as a dataframe in pandas.
How to use pandas to read pickle files?
You can use the pandas read_pickle() function to read pickled pandas objects (.pkl files) as dataframes in Python. Similar to reading CSV or Excel files in pandas, this function returns a pandas dataframe of the data stored in the file. The following is the syntax:
df = pd.read_pickle('my_data.pkl')
Here, “my_data.pkl” is the pickle file storing the data you want to read.
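The same call can also read compressed pickle files. Below is a minimal sketch, assuming a small made-up dataframe and a temporary directory (neither is part of the original example). Both to_pickle() and read_pickle() take a compression parameter that defaults to 'infer', so a file name ending in .gz is written and read with gzip.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({"Symbol": ["MSFT", "GOOG"], "Shares": [100, 50]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # The ".gz" suffix lets pandas infer gzip compression on write and read
    path = os.path.join(tmp_dir, "my_data.pkl.gz")
    df.to_pickle(path)
    restored = pd.read_pickle(path)

# True if the restored dataframe has the same values and dtypes as the original
print(restored.equals(df))
```

If the file name does not carry a recognizable extension, the compression can be passed explicitly, for example compression='gzip'.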
Exercise caution when working with pickle files. The pickle module is not secure. Only unpickle data you trust. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. See the documentation of Python’s pickle module for details.
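To see why this warning matters, here is a harmless sketch using Python's built-in pickle module. The Payload class is hypothetical and exists only for this demonstration. Its __reduce__ method tells pickle which callable to run when the data is loaded. Here the callable is the harmless built-in sorted, but a malicious file could name any callable instead.

```python
import pickle


class Payload:
    """Hypothetical class for illustration only."""

    def __reduce__(self):
        # On unpickling, pickle calls sorted([3, 1, 2]) instead of
        # rebuilding a Payload instance.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading the bytes runs the callable chosen by whoever created the pickle
result = pickle.loads(blob)

print(type(result).__name__, result)  # prints: list [1, 2, 3]
```

The loaded object is not a Payload at all: it is whatever the embedded call returned. Since read_pickle() unpickles the file contents, the same caution applies to it.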
Example
Let’s look at an example of using the pandas read_pickle() function. First, we’ll create a sample dataframe that we’ll save locally as a pickle file using the pandas to_pickle() function.
import pandas as pd

data = {
    'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',
             'Apple Inc.', 'Netflix, Inc.'],
    'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
    'Industry': ['Tech', 'Tech', 'Automotive', 'Tech', 'Entertainment'],
    'Shares': [100, 50, 150, 200, 80]
}

# create dataframe
df = pd.DataFrame(data)

# print dataframe
print(df)
Output:
                    Name Symbol       Industry  Shares
0  Microsoft Corporation   MSFT           Tech     100
1            Google, LLC   GOOG           Tech      50
2            Tesla, Inc.   TSLA     Automotive     150
3             Apple Inc.   AAPL           Tech     200
4          Netflix, Inc.   NFLX  Entertainment      80
The above is a pandas dataframe representing a sample stock portfolio. Let’s now save this data locally as a pickle file using the pandas to_pickle() function.
# save dataframe as a pickle file
df.to_pickle('portfolio.pkl')
Now that we have a dataframe saved as a pickle file with the name portfolio.pkl, we can read it back as a dataframe using the pandas read_pickle() function.
# read pickle file as dataframe
df2 = pd.read_pickle('portfolio.pkl')

# print the dataframe
print(df2)
Output:
                    Name Symbol       Industry  Shares
0  Microsoft Corporation   MSFT           Tech     100
1            Google, LLC   GOOG           Tech      50
2            Tesla, Inc.   TSLA     Automotive     150
3             Apple Inc.   AAPL           Tech     200
4          Netflix, Inc.   NFLX  Entertainment      80
You can see that the pickle file data gets loaded successfully as a pandas dataframe. You can similarly use the pandas read_pickle() function to read dataframes that are serialized as pickle objects. But make sure you’re using pickle files from trusted sources only.
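If you want to confirm programmatically that a round trip preserved the data, rather than comparing printed output by eye, a check along the following lines may help. This is a sketch under assumptions: the Bought date column and the temporary directory are made up for illustration and are not part of the portfolio example above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical variant of the portfolio data with an added date column
df = pd.DataFrame({
    "Symbol": ["MSFT", "GOOG", "TSLA"],
    "Shares": [100, 50, 150],
    "Bought": pd.to_datetime(["2020-01-15", "2020-03-02", "2020-06-30"]),
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "portfolio.pkl")
    df.to_pickle(path)
    df2 = pd.read_pickle(path)

# Raises AssertionError if values, dtypes, index, or columns differ
pd.testing.assert_frame_equal(df, df2)

# The date column comes back as a datetime dtype, not as plain strings
print(df2.dtypes)
```

Because a pickle file stores the dataframe object itself, column dtypes such as datetimes come back unchanged, whereas a CSV round trip would generally need them parsed again.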
For more on the pandas read_pickle() function, refer to its documentation.
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a Python (version 3.8.3) kernel having pandas version 1.0.5.