Histograms show how the values of a variable are distributed across a set of buckets (bins), which makes them well suited to visualizing a distribution. In this tutorial, we’ll look at how to plot a histogram in Python using matplotlib.
How to plot a histogram with matplotlib?
Matplotlib is a Python plotting library that comes with a number of handy formatting and plot options. To plot a histogram, you can use matplotlib pyplot’s hist() function. The following is the syntax:
import matplotlib.pyplot as plt
plt.hist(x)
plt.show()
Here, x is the array or sequence of values of the variable for which you want to construct a histogram. You can also specify the number of bins or the bin edges you want in the plot using the bins parameter (see the examples below).
Examples
Let’s look at some examples of using the hist() function to plot a histogram.
1. Histogram with default parameters in matplotlib
Let’s say you want to plot a histogram of the marks obtained by 100 students in a high school Math class. You can use matplotlib pyplot’s hist() function for it. Let’s see what we get using just the default parameters.
import matplotlib.pyplot as plt
# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
71, 61, 77, 40, 83, 52, 78, 54, 64, 58]
# plot histogram
plt.hist(math_scores)
plt.show()
Output:
This histogram somewhat resembles a normal distribution, with a large number of students scoring between 60 and 80 (closer to the mean) and the frequency tapering off at both ends. We can add some basic formatting to the above plot, such as axis labels and a chart title, to make it clearer.
import matplotlib.pyplot as plt
# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
71, 61, 77, 40, 83, 52, 78, 54, 64, 58]
# plot histogram
plt.hist(math_scores)
# add formatting
plt.xlabel("Score")
plt.ylabel("Students")
plt.title("Histogram of scores in the Math class")
plt.show()
Output:
2. Histogram with probability densities instead of frequencies
You can change the values on the y-axis from frequencies to probability densities using the density parameter, which is False by default. With density=True, the height of each bin is its probability density rather than its count.
import matplotlib.pyplot as plt
# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
71, 61, 77, 40, 83, 52, 78, 54, 64, 58]
# plot histogram
plt.hist(math_scores, density=True)
# add formatting
plt.xlabel("Score")
plt.ylabel("Density")
plt.title("Histogram of scores in the Math class")
plt.show()
Output:
In the above chart, the height of each bin is the “density” of the observations that fall in it. That is, for a bin, density = count inside the bin / (total count × bin width), so the areas of all the bars add up to 1.
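To make the density formula concrete, here is a minimal sketch in plain Python. The bin counts and the bin width are hypothetical values chosen for easy arithmetic; they are not read from the chart above.

```python
# Hypothetical bin counts for three equal-width bins (illustrative values only).
counts = [2, 5, 3]
bin_width = 10
total = sum(counts)

# density = count inside the bin / (total count x bin width)
densities = [count / (total * bin_width) for count in counts]

# The area of each bar is density x bin width; the areas add up to 1.
area = sum(density * bin_width for density in densities)

print(densities)        # [0.02, 0.05, 0.03]
print(round(area, 10))  # 1.0
```

Because the areas of the bars (not their heights) sum to 1, the heights depend on the bin width and should not be read as per-bin probabilities.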
3. Histogram with custom bin counts
In the above examples, you can see that the hist() function, by default, uses 10 equal-width bins. You can specify your own bin count using the bins parameter. For instance, if you want the histogram to have 20 bins:
import matplotlib.pyplot as plt
# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
71, 61, 77, 40, 83, 52, 78, 54, 64, 58]
# plot histogram
plt.hist(math_scores, bins=20)
# add formatting
plt.xlabel("Score")
plt.ylabel("Students")
plt.title("Histogram of scores in the Math class")
plt.show()
Output:
You can see that with a higher bin count we get thinner, more granular bins. Also, note that, except for the last bin, each bin includes its lower bound and excludes its upper bound: [include, exclude). The final bin includes both its lower and upper bounds: [include, include].
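The bin boundary rule can be checked with numpy.histogram, which matplotlib’s hist() uses to bin the data. The sketch below uses hypothetical values that sit exactly on the bin edges, so each value’s bin depends only on which bounds are inclusive.

```python
import numpy as np

# Hypothetical values placed exactly on the bin edges.
values = [10, 20, 30]
edges = [10, 20, 30]  # two bins: [10, 20) and [20, 30]

counts, returned_edges = np.histogram(values, bins=edges)

# 10 falls in the first bin; 20 and 30 both fall in the last bin,
# because only the last bin includes its upper bound.
print(counts)  # [1 2]
```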
4. Histogram with custom bin edges
You can also specify your own bin edges, which can be unequally spaced. For this, instead of passing an integer to the bins parameter, pass a sequence with the bin edges. For example, if you want to have the bins 0 to 20, 20 to 50, 50 to 70, 70 to 90, and 90 to 100:
import matplotlib.pyplot as plt
# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
71, 61, 77, 40, 83, 52, 78, 54, 64, 58]
# specify the bin edges
bin_edges = [0,20,50,70,90,100]
# plot histogram
plt.hist(math_scores, bins=bin_edges)
# add formatting
plt.xlabel("Marks in Math")
plt.ylabel("Students")
plt.title("Histogram of scores in the Math class")
plt.show()
Output:
Here, the bins are unequally spaced because of the bin edges specified. Matplotlib’s hist() function also has a number of other parameters to customize your plots even further. For more, refer to its documentation.
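As one possible illustration of those other parameters, the sketch below passes color, edgecolor, and alpha to hist() and reads back the values it returns (the bin counts, the bin edges, and the drawn bars). The specific color choices are arbitrary. The Agg backend line is an assumption so that the script runs without a display; in a notebook you would drop it and call plt.show() instead of plt.close().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
               62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
               76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
               73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
               53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
               68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
               39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
               86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
               52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
               71, 61, 77, 40, 83, 52, 78, 54, 64, 58]

bin_edges = [0, 20, 50, 70, 90, 100]

# hist() returns the count per bin, the bin edges, and the bar artists
counts, edges, patches = plt.hist(
    math_scores,
    bins=bin_edges,
    color="steelblue",   # fill color of the bars (arbitrary choice)
    edgecolor="black",   # outline each bar so adjacent bins are easy to tell apart
    alpha=0.8,           # slight transparency
)
plt.xlabel("Score")
plt.ylabel("Students")
plt.title("Histogram of scores in the Math class")

print(counts)  # number of students in each bin
print(edges)   # the bin edges that were used
plt.close()
```

Reading the returned counts is a quick way to confirm how many observations fell into each custom bin without estimating bar heights from the plot.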
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and matplotlib version 3.2.2.