Save Pandas DataFrame to a CSV file

Pandas dataframes are used to store and manipulate two-dimensional tabular data in python. After having performed your pre-processing or analysis with your data, you may want to save it as a separate CSV (Comma Separated Values) file for future use or reference. The pandas to_csv() function is used to save a dataframe as a CSV file. In this tutorial, we’ll cover its usage along with some commonly used parameters through examples.

It is a pandas dataframe function used to save a dataframe as a CSV file. The following is its syntax:

df.to_csv(path)

The above syntax by default saves the index of the dataframe as a separate column. If you do not want to include the index, pass index=False to the above function.

Let’s create a sample dataframe first that we’ll be writing to the local machine as a CSV file.

import pandas as pd

data = {
    'Name': ['Microsoft Corporation', 'Google, LLC', 'Tesla, Inc.',\
             'Apple Inc.', 'Netflix, Inc.'],
    'Symbol': ['MSFT', 'GOOG', 'TSLA', 'AAPL', 'NFLX'],
    'Shares': [100, 50, 150, 200, 80]
}

df = pd.DataFrame(data)
df

A dataframe with name, stock symbol, and the respective shares count of companies in a sample portfolio:

A sample dataframe with three columns and five rows.

Let’s see a few examples of some of the different ways to_csv() is used:

df.to_csv("sample_portfolio.csv")

This is how the saved CSV file looks if we open it up in Excel:

Snapshot of the csv in excel. It shows all the dataframe columns with an addional column with index.

You can see in the above snapshot that using the to_csv() function with default parameters saves it along with an additional column for index.

Generally, you may not want to include the index of the dataframe as a separate column. Particularly when they’re just continuous numbers providing no additional information. For this, you can pass the parameter index=False to the to_csv() function.

df.to_csv("sample_portfolio.csv", index=False)

This is how the saved CSV file looks if we open it up in Excel:

Snapshot of the saved CSV in excel. It does not have index as a separate column.

You can see in the above snapshot that the saved CSV now does not have an additional column for index.

If you do not want to include column names in your saved CSV file pass header=False to the to_csv() function.

df.to_csv("sample_portfolio.csv", index=False, header=False)

This is how the saved CSV file looks if we open it up in Excel:

Snapshot of the saved CSV in excel, it doesn't have index and header.

Since we passed header=False the saved CSV file doesn’t have the column headers. You can also pass a custom list of column names to the header argument if you want columns to have different names. Note that we also passed index=False.

The to_csv() function also allows you the flexibility to choose the columns you want from the dataframe to be saved to the CSV file. You can pass the column names you want to include as a list to the columns argument.

df.to_csv("sample_portfolio.csv", index=False, columns=['Symbol', 'Shares'])

This is how the saved CSV file looks if we open it up Excel.

Snapshot of the saved CSV with column Symbol and Shares in excel.

In the above example, we passed the list of columns to be included in the CSV file as a list to the columns argument of the to_csv() function. You can see that only the columns passed: Symbol and Shares are present in the saved CSV file. Note that we also passed index=False.

For more on the pandas dataframe to_csv() refer to its official documentation.

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a python (version 3.8.3) kernel having pandas version 1.0.5


Subscribe to our newsletter for more informative guides and tutorials.
We do not spam and you can opt out any time.