Boxplots are useful for visualizing the spread of data and are particularly helpful for spotting outliers. In this tutorial, we will look at how to create a boxplot from the values of a pandas series.
Pandas Series as Boxplot
To plot a pandas series, you can use the pandas series plot() function. It plots a line chart of the series values by default, but you can specify the type of chart to plot using the kind parameter. To create a boxplot, pass 'box' to the kind parameter. The following is the syntax:
# boxplot using pandas series plot()
s.plot(kind='box')
Here, s is the pandas series you want to plot. The pandas series plot() function returns a matplotlib axes object to which you can add additional formatting.
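As a minimal sketch of the syntax above, the snippet below builds a small made-up series (the values and the name "Values" are placeholders, not data from this tutorial) and checks that the object returned by plot() is a matplotlib axes. It also shows the s.plot.box() accessor, which should produce the same kind of chart. The "Agg" backend line is an assumption so the code runs without a display; you can drop it in a Jupyter Notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.axes import Axes

# hypothetical stand-in for the series s
s = pd.Series([12, 15, 11, 19, 14, 13, 30], name="Values")

# two side-by-side axes so each call draws on its own chart
fig, (left, right) = plt.subplots(1, 2)

# the form used in this tutorial
ax_kind = s.plot(kind="box", ax=left)

# the accessor form, which takes the same keyword arguments
ax_accessor = s.plot.box(ax=right)

# both calls hand back a matplotlib axes object
print(isinstance(ax_kind, Axes), isinstance(ax_accessor, Axes))
```

Passing ax explicitly is optional, but it makes clear which chart each call draws on when several plots are created in the same session.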
Examples
Let’s look at some examples of plotting the values of a pandas series as a boxplot. First, we’ll create a sample pandas series which we will be using throughout this tutorial.
import pandas as pd

# scores in the Math class
math_scores = pd.Series(data=[72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
                              62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
                              76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
                              73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
                              53, 71, 79, 80, 60, 60, 31, 40, 73, 75,
                              68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
                              39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
                              86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
                              52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
                              71, 61, 77, 40, 83, 52, 78, 54, 64, 58],
                        name='Scores')

# display the series head
print(math_scores.head())
Output:
0    72
1    41
2    65
3    63
4    82
Name: Scores, dtype: int64
You can see the top five values of the series object above. We now have a pandas series containing the scores of students in a Math class.
1. Boxplot of Series Values
To create a boxplot from the series values, we’ll pass kind='box' to the pandas series plot() function. For example, let’s see its usage on the “math_scores” series created above.
math_scores.plot(kind='box')
Output:
The box spans from the first quartile (Q1) to the third quartile (Q3) of the series, and the line inside the box marks the median (Q2). Each whisker extends from a box edge to the most extreme value that lies within 1.5 * IQR of that edge (IQR, the interquartile range, is Q3 - Q1). Values beyond the whiskers are drawn as individual dots and represent the outliers.
In the above chart, the green line depicting the median sits at around 65. Notice also that there is one outlier at the lower end of the scores, represented by a dot.
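To make the quartile and whisker rules above concrete, here is a rough sketch that computes the same quantities by hand with pandas. It uses a short made-up series rather than the full math_scores data, and assumes the default linear interpolation of quantile() together with the 1.5 * IQR rule described above.

```python
import pandas as pd

# hypothetical scores, chosen so the quartiles are easy to check by hand
scores = pd.Series([31, 52, 58, 60, 62, 65, 68, 72, 75, 80, 97], name="Scores")

# quartiles using linear interpolation (the pandas default)
q1 = scores.quantile(0.25)
median = scores.quantile(0.50)
q3 = scores.quantile(0.75)
iqr = q3 - q1

# fences: the furthest a whisker may reach from the box edges
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# whiskers stop at the most extreme values still inside the fences
inside = scores[(scores >= lower_fence) & (scores <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# values outside the fences are the ones a boxplot draws as dots
outliers = scores[(scores < lower_fence) | (scores > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"whiskers: {whisker_low} to {whisker_high}")
print(f"outliers: {outliers.tolist()}")
```

For this sample, the fences work out to 37.25 and 95.25, so the whiskers end at 52 and 80 while 31 and 97 fall outside and would appear as outlier dots.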
2. Customize the plot formatting
You can also customize the formatting of the chart. For example, you can add axis labels, a chart title, etc. Since the returned plot is a matplotlib axes object, you can apply any formatting that would work with matplotlib charts. Let’s go ahead and add a title to our plot.
# create the boxplot
ax = math_scores.plot(kind='box')
# set the title
ax.set_title("Spread of Math Scores")
Output:
For more on the pandas series plot() function, refer to its documentation.
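As a further sketch of the formatting options mentioned in this section, the snippet below adds a y-axis label and a light horizontal grid in addition to the title. To stay short, it uses only the first ten values of the math_scores series; the figure size, label text, and grid style are illustrative choices, and the "Agg" backend line is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# first ten values of the math_scores series from this tutorial
math_scores = pd.Series([72, 41, 65, 63, 82, 63, 51, 57, 39, 63], name="Scores")

# create the figure and axes up front so the size can be controlled
fig, ax = plt.subplots(figsize=(4, 5))

# draw the boxplot on the prepared axes
math_scores.plot(kind="box", ax=ax)

# standard matplotlib axes methods handle the formatting
ax.set_title("Spread of Math Scores")
ax.set_ylabel("Score")
ax.grid(axis="y", linestyle=":")

# tidy the layout so labels are not clipped
fig.tight_layout()
```

Creating the axes with plt.subplots() and passing it through the ax parameter is one way to keep the figure handle around, for example if you later want to save the chart to a file.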
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a Python (version 3.8.3) kernel having pandas version 1.0.5.
Tutorials on pandas series –
- Convert Pandas Series to a DataFrame
- Convert Pandas Series to a List
- Convert Pandas Series to a NumPy Array
- Convert Pandas Series to a Dictionary
- Sort a Pandas Series
- Append Two Pandas Series
- Apply a Function to a Pandas Series
- Pandas – Shift column values up or down
- Plot a Histogram of Pandas Series Values
- Create a Pie Chart of Pandas Series Values
- Plot a Bar Chart of Pandas Series Values
- Create a Boxplot from Pandas Series Values
- Create a Density Plot from Pandas Series Values