In this tutorial, we will look at the pandas dataframe describe() function with the help of some examples.

## What does describe() do in Pandas dataframe?

The pandas dataframe `describe()` function is used to get the descriptive statistics for a dataframe. The following is the syntax:

```python
# get dataframe's descriptive stats
df.describe()
```

You can also apply the `describe()` function to a pandas series.
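As a small illustration of the series case, the sketch below calls `describe()` on a single numeric series. The `ages` series is made-up sample data used only for this example.

```python
import pandas as pd

# hypothetical sample data, for illustration only
ages = pd.Series([26, 28, 27, 32], name="Age")

# describe() on a series returns a series of summary statistics
stats = ages.describe()
print(stats)
print(type(stats))
```

The index of the result holds the statistic names (count, mean, std, and so on), so an individual value can be read with, for example, `stats["mean"]`.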

The `describe()` function takes the following arguments:

- **percentiles** (*list or list-like of numbers*) – The percentiles (for numeric fields) to include in the result. The percentile values lie between 0 and 1, and by default the result includes the percentiles `[0.25, 0.5, 0.75]`.
- **include** (*‘all’, None, or list-like of dtypes*) – Indicates which type of fields to include when generating the description. By default, it’s `None`, in which case the description is generated only for numeric columns. You can use `'all'` to include all the columns or pass a list of dtypes that you want to be included.
- **exclude** (*None, or list-like of dtypes*) – Indicates which fields to exclude when generating the description. By default, it’s `None`, meaning don’t additionally exclude anything. You can also pass a list of dtypes that you want to be excluded.
- **datetime_is_numeric** (*bool*) – Whether to treat datetime fields (columns) as numeric types when generating the description. It is `False` by default. Note that this parameter was removed in pandas 2.0, where datetime columns are always summarized as numeric data.
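The following is a minimal sketch of how the `percentiles` and `exclude` arguments change the result. The `df_demo` dataframe and its column names are hypothetical sample data, not part of the tutorial's dataset.

```python
import pandas as pd

# hypothetical sample data, for illustration only
df_demo = pd.DataFrame({
    "City": ["Oslo", "Lima", "Pune", "Oslo"],
    "Temp": [4.0, 19.5, 27.0, 6.5],
    "Visits": [10, 20, 30, 40],
})

# request the 10th, 50th, and 90th percentiles instead of the defaults
custom = df_demo.describe(percentiles=[0.1, 0.5, 0.9])
print(custom)

# leave out float columns; the remaining columns are summarized
no_floats = df_demo.describe(exclude=["float64"])
print(no_floats)
```

In the first result the percentile rows are labeled `10%`, `50%`, and `90%`. In the second, the "Temp" column is dropped while "City" and "Visits" remain.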

It returns the resulting descriptive statistics as a pandas dataframe (a pandas series if you apply it on a series).

## Examples

Let’s now look at some examples of using the above syntax to generate descriptions of some dataframes.

First, we will create a pandas dataframe that we will be using throughout this tutorial.

```python
import pandas as pd

# employee data
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}

# create pandas dataframe
df = pd.DataFrame(data)

# display the dataframe
df
```

Output:

Here, we created a dataframe containing information about some employees in an office. The dataframe has four columns: “Name”, “Age”, “Department”, and “Salary”.

Let’s check the data type of each column in the above dataframe. You can use the pandas dataframe `dtypes` property.

```python
# get column dtypes
df.dtypes
```

Output:

```
Name          object
Age            int64
Department    object
Salary         int64
dtype: object
```

We get the dtype of each column in the above dataframe. You can see that the “Age” and “Salary” columns are of `int64` type (they are numeric) and the “Name” and “Department” columns are of `object` type (generally used for string and categorical fields).
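If it helps to see ahead of time which columns count as numeric, one option is the dataframe's `select_dtypes()` method. The sketch below is not part of the original walkthrough; it rebuilds the same employee dataframe so that it runs on its own, and `numeric_cols` is just an illustrative variable name.

```python
import pandas as pd

# same employee data as in the tutorial
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}
df = pd.DataFrame(data)

# names of the columns that have a numeric dtype
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

For this dataframe the list contains “Age” and “Salary”, which are the columns that `describe()` summarizes when called without arguments.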

Let’s now look at examples of using the `describe()` function.

### Example 1 – Get statistics for only numeric columns using pandas `describe()`

The pandas dataframe `describe()` function, by default, includes only the numeric columns when generating the dataframe’s description. (The default value for the `include` parameter is `None`.)

Let’s apply the `describe()` function on the above dataframe without any parameters (that is, using the default values of the parameters).

```python
# get dataframe's descriptive statistics
df.describe()
```

Output:

We get the description only for the numeric columns – “Age” and “Salary”. The result contains descriptive statistics like count, mean, min, max, standard deviation, and percentile values for the 25th, 50th, and 75th percentile.
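To connect these numbers to the underlying data, the following sketch recomputes a few of them for the “Age” column with `mean()`, `std()`, and `quantile()` and prints them next to what `describe()` reports. It rebuilds the same employee dataframe so that it runs on its own; the names `desc` and `age` are only for illustration.

```python
import pandas as pd

# same employee data as in the tutorial
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}
df = pd.DataFrame(data)

desc = df.describe()
age = df["Age"]

# each pair should show the same value twice
print(desc.loc["mean", "Age"], age.mean())
print(desc.loc["std", "Age"], age.std())          # sample standard deviation
print(desc.loc["25%", "Age"], age.quantile(0.25))
print(desc.loc["50%", "Age"], age.median())
```

The rows of the result are indexed by statistic name, so `desc.loc["mean", "Age"]` reads the mean of the “Age” column directly from the description.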

### Example 2 – Get statistics for only non-numeric columns using pandas `describe()`

Let’s now get the statistics for only the `object` type columns in the above dataframe.

Pass the dtypes you want to be included as a list to the `include` parameter.

```python
# get dataframe's descriptive statistics for non-numeric columns
df.describe(include=['object'])
```

Output:

We get the description only for the `object` type columns: “Name” and “Department”. The result contains statistics like count, unique (the number of distinct values), top (the most frequent value), and freq (the count of the most frequent value in the column).
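As a rough cross-check of what these statistics mean, the sketch below describes the “Department” column on its own and compares the result with `nunique()` and `value_counts()`. It rebuilds the same employee dataframe so that it runs on its own; `dept`, `summary`, and `counts` are illustrative names.

```python
import pandas as pd

# same employee data as in the tutorial
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}
df = pd.DataFrame(data)

dept = df["Department"]
summary = dept.describe()      # count, unique, top, freq
counts = dept.value_counts()   # sorted from most to least frequent

print(summary["unique"], dept.nunique())   # number of distinct values
print(summary["top"], counts.index[0])     # most frequent value
print(summary["freq"], counts.iloc[0])     # how often it occurs
```

Here “Sales” appears twice and the other departments once each, so it is reported as the top value with a frequency of 2.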

### Example 3 – Get the statistics for all the columns using `describe()`

To get the statistics for all the columns, pass `include='all'` to the pandas dataframe `describe()` function.

```python
# get dataframe's descriptive statistics for all columns
df.describe(include='all')
```

Output:

We get the statistics for all the columns in the above dataframe.

You can see that this result is a sort of combination of the above two results.
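One detail worth checking in the combined result is that statistics which do not apply to a column are filled with `NaN`. The sketch below rebuilds the same employee dataframe so that it runs on its own; `full` is an illustrative variable name.

```python
import pandas as pd

# same employee data as in the tutorial
data = {
    "Name": ["Jim", "Dwight", "Angela", "Tobi"],
    "Age": [26, 28, 27, 32],
    "Department": ["Sales", "Sales", "Accounting", "HR"],
    "Salary": [55000, 60000, 52000, 45000]
}
df = pd.DataFrame(data)

full = df.describe(include="all")
print(full)

# unique/top/freq are not computed for numeric columns
print(full.loc[["unique", "top", "freq"], "Age"].isna().all())

# mean is not computed for the text columns
print(full.loc["mean", ["Name", "Department"]].isna().all())
```

Both checks print `True`: the numeric columns have no unique, top, or freq values, and the text columns have no mean.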

## Summary

In this tutorial, we looked at how to get descriptive statistics for a dataframe using the describe() function in pandas. The following are the key takeaways –

- The `describe()` function, by default, generates the statistics for only the numeric columns.
- To include specific column types in the result, pass the dtypes to include as a list to the `include` parameter.
- If you want to get the statistics for all the columns, pass `'all'` to the `include` parameter.

