Convert Pandas DataFrame to a Dictionary

The pandas dataframe to_dict() function can be used to convert a pandas dataframe to a dictionary. It also allows a range of orientations for the key-value pairs in the returned dictionary. In this tutorial, we’ll look at how to use this function with the different orientations to get a dictionary.

The following is the syntax:

d = df.to_dict(orient='dict')

Here, df is the dataframe you want to convert. The orient parameter determines the orientation of the returned dictionary. Its default value is 'dict', which returns a dictionary of the form {column: {index: value}}.

Let’s look at its usage through some examples:

First, let’s create a sample dataframe that we’ll be using throughout this tutorial.

# importing pprint to better print nested dictionaries
import pprint as pp
import pandas as pd

data = {
    'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',
             'Apple Inc.', 'Netflix, Inc.'],
    'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
    'Shares': [100, 50, 150, 200, 80],
}

df = pd.DataFrame(data, index=['Row1', 'Row2', 'Row3', 'Row4', 'Row5'])

df

This creates a dataframe representing a sample stock portfolio, with the company name, stock symbol, and share count for each holding. We also gave the rows custom index labels to better show (in subsequent examples) how the rows are represented in the dictionary returned by the to_dict() function.

Snapshot of the sample dataframe to be used in this tutorial.

Now, let’s look at some of the different dictionary orientations that you can get using the to_dict() function.

Using the pandas dataframe to_dict() function with the default value for orient, that is, 'dict', returns a dictionary like {column: {index: value}}. See the example below:

# convert dataframe to dictionary
d = df.to_dict()
# print the dictionary
pp.pprint(d)

Output:

{'Name': {'Row1': 'Microsoft Corporation',
          'Row2': 'Google, LLC',
          'Row3': 'Tesla, Inc.',
          'Row4': 'Apple Inc.',
          'Row5': 'Netflix, Inc.'},
 'Shares': {'Row1': 100, 'Row2': 50, 'Row3': 150, 'Row4': 200, 'Row5': 80},
 'Symbol': {'Row1': 'MSFT',
            'Row2': 'GOOG',
            'Row3': 'TSLA',
            'Row4': 'AAPL',
            'Row5': 'NFLX'}}

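In the above example, you can see the format of the dictionary returned. It has the column names as keys and the {index: value} mappings for that column as values. Note that pprint sorts dictionary keys alphabetically for display, which is why 'Shares' is printed before 'Symbol'; the dictionary itself keeps the dataframe's column order.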
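As a small sketch of how this nested structure is read, a single cell can be reached by indexing first with the column name and then with the row label. The two-row dataframe below is a trimmed-down stand-in for the sample portfolio, used here only to keep the example short.

```python
import pandas as pd

# trimmed-down stand-in for the sample dataframe (assumption for brevity)
df = pd.DataFrame(
    {'Symbol': ['MSFT', 'GOOG'], 'Shares': [100, 50]},
    index=['Row1', 'Row2'],
)

# default orient='dict' gives {column: {index: value}}
d = df.to_dict()

# look up one cell: column name first, then row label
shares_row1 = d['Shares']['Row1']
print(shares_row1)

# the inner dictionary's keys are the row labels
row_labels = list(d['Symbol'].keys())
print(row_labels)
```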

If you want the returned dictionary to have the format {column: [values]}, pass 'list' to the orient parameter.

# convert dataframe to dictionary
d = df.to_dict(orient='list')
# print the dictionary
pp.pprint(d)

Output:

{'Name': ['Microsoft Corporation',
          'Google, LLC',
          'Tesla, Inc.',
          'Apple Inc.',
          'Netflix, Inc.'],
 'Shares': [100, 50, 150, 200, 80],
 'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX']}

In the above example, the returned dictionary has the column names as keys and the list of column values as the respective value for each key.
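One thing worth noting about the 'list' orientation is that the row labels are not part of the result. The sketch below, again using a shortened two-row stand-in for the sample dataframe, shows that rebuilding a dataframe from such a dictionary gives a default integer index unless the original index is passed back in explicitly.

```python
import pandas as pd

# trimmed-down stand-in for the sample dataframe (assumption for brevity)
df = pd.DataFrame(
    {'Symbol': ['MSFT', 'GOOG'], 'Shares': [100, 50]},
    index=['Row1', 'Row2'],
)

d = df.to_dict(orient='list')

# rebuilding from {column: [values]} falls back to a default 0..n-1 index
rebuilt = pd.DataFrame(d)
print(list(rebuilt.index))

# supplying the original index restores the row labels
restored = pd.DataFrame(d, index=df.index)
print(list(restored.index))
```

If the row labels matter for your use case, one of the index-aware orientations may be a better fit.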

If you want the returned dictionary to have the format {column: Series(values)}, pass 'series' to the orient parameter.

# convert dataframe to dictionary
d = df.to_dict(orient='series')
# print the dictionary
pp.pprint(d)
# check the type of the value
print("\nThe type of values:", type(d['Shares']))

Output:

{'Name': Row1    Microsoft Corporation
Row2              Google, LLC
Row3              Tesla, Inc.
Row4               Apple Inc.
Row5            Netflix, Inc.
Name: Name, dtype: object,
 'Shares': Row1    100
Row2     50
Row3    150
Row4    200
Row5     80
Name: Shares, dtype: int64,
 'Symbol': Row1    MSFT
Row2    GOOG
Row3    TSLA
Row4    AAPL
Row5    NFLX
Name: Symbol, dtype: object}

The type of values: <class 'pandas.core.series.Series'>

In the above example, the returned dictionary has the column names as keys and pandas series of the column values as the respective value for each key.
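Because the values are still pandas series, the usual series operations remain available on them. The following is a minimal sketch of that idea on a two-row stand-in for the sample dataframe; the threshold of 75 shares is an arbitrary value chosen for illustration.

```python
import pandas as pd

# trimmed-down stand-in for the sample dataframe (assumption for brevity)
df = pd.DataFrame(
    {'Symbol': ['MSFT', 'GOOG'], 'Shares': [100, 50]},
    index=['Row1', 'Row2'],
)

d = df.to_dict(orient='series')

# each value is a Series, so aggregation works directly
total_shares = d['Shares'].sum()
print(total_shares)

# boolean filtering also works (75 is an arbitrary illustrative threshold)
large_positions = d['Shares'][d['Shares'] > 75]
print(list(large_positions.index))
```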

If you want the returned dictionary to have the dataframe's row indexes as keys instead of its columns, pass 'index' to the orient parameter. The returned dictionary has the format {index: {column: value}}.

# convert dataframe to dictionary
d = df.to_dict(orient='index')
# print the dictionary
pp.pprint(d)

Output:

{'Row1': {'Name': 'Microsoft Corporation', 'Shares': 100, 'Symbol': 'MSFT'},
 'Row2': {'Name': 'Google, LLC', 'Shares': 50, 'Symbol': 'GOOG'},
 'Row3': {'Name': 'Tesla, Inc.', 'Shares': 150, 'Symbol': 'TSLA'},
 'Row4': {'Name': 'Apple Inc.', 'Shares': 200, 'Symbol': 'AAPL'},
 'Row5': {'Name': 'Netflix, Inc.', 'Shares': 80, 'Symbol': 'NFLX'}}

In the above example, you can see that the returned dictionary has row indexes as the keys and {column: value} mapping for that row as the respective dictionary value.
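A dictionary in this format can also be turned back into a dataframe. The sketch below uses pd.DataFrame.from_dict() with orient='index' on a two-row stand-in for the sample dataframe; it is meant as an illustration of the round trip rather than the only way to do it.

```python
import pandas as pd

# trimmed-down stand-in for the sample dataframe (assumption for brevity)
df = pd.DataFrame(
    {'Symbol': ['MSFT', 'GOOG'], 'Shares': [100, 50]},
    index=['Row1', 'Row2'],
)

# {index: {column: value}}
d = df.to_dict(orient='index')
print(d['Row1']['Symbol'])

# treat the outer keys as row labels when rebuilding
rebuilt = pd.DataFrame.from_dict(d, orient='index')
print(rebuilt.loc['Row2', 'Shares'])
```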

The to_dict() function also allows you to split your dataframe into its components, with the returned dictionary having the format {'index': [index], 'columns': [columns], 'data': [values]}. For this, pass 'split' to the orient parameter.

# convert dataframe to dictionary
d = df.to_dict(orient='split')
# print the dictionary
pp.pprint(d)

Output:

{'columns': ['Name', 'Symbol', 'Shares'],
 'data': [['Microsoft Corporation', 'MSFT', 100],
          ['Google, LLC', 'GOOG', 50],
          ['Tesla, Inc.', 'TSLA', 150],
          ['Apple Inc.', 'AAPL', 200],
          ['Netflix, Inc.', 'NFLX', 80]],
 'index': ['Row1', 'Row2', 'Row3', 'Row4', 'Row5']}

In the above example, the returned dictionary is a result of splitting the dataframe into its individual components with the keys 'columns', 'data', and 'index' and their respective values as lists.
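Since the three components map directly onto the data, index, and columns arguments of the pd.DataFrame constructor, a split dictionary can be reassembled in one call. Here is a minimal sketch on a two-row stand-in for the sample dataframe.

```python
import pandas as pd

# trimmed-down stand-in for the sample dataframe (assumption for brevity)
df = pd.DataFrame(
    {'Symbol': ['MSFT', 'GOOG'], 'Shares': [100, 50]},
    index=['Row1', 'Row2'],
)

d = df.to_dict(orient='split')

# reassemble the dataframe from its three components
rebuilt = pd.DataFrame(d['data'], index=d['index'], columns=d['columns'])
print(rebuilt)
```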

For more on the pandas dataframe to_dict() function, refer to its official documentation.
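The documentation also describes orientations not covered above. One of them is 'records', which returns a list with one {column: value} dictionary per row and leaves out the row labels. A brief sketch, using a two-row stand-in for the sample dataframe:

```python
import pandas as pd

# trimmed-down stand-in for the sample dataframe (assumption for brevity)
df = pd.DataFrame(
    {'Symbol': ['MSFT', 'GOOG'], 'Shares': [100, 50]},
    index=['Row1', 'Row2'],
)

# one dictionary per row; note the result is a list, not a dictionary
records = df.to_dict(orient='records')
for record in records:
    print(record)
```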

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.

