# Plot Histogram in Python using Matplotlib

Histograms show the frequency distribution of a variable’s values across different buckets (bins), which makes them great for visualizing how the variable is distributed. In this tutorial, we’ll look at how to plot a histogram in Python using Matplotlib.

Matplotlib is a library in Python used for plotting visualizations and comes with a number of handy formatting and plot options. To plot a histogram you can use matplotlib pyplot’s `hist()` function. The following is the syntax:

```python
import matplotlib.pyplot as plt
plt.hist(x)
plt.show()
```

Here, `x` is the array or sequence of values of the variable for which you want to construct a histogram. You can also specify the number of bins or the bin edges you want in the plot using the `bins` parameter (see the examples below).

Let’s look at some examples of using the `hist()` function to plot a histogram.

Let’s say you want to plot a histogram of the marks obtained by 100 students in a high school Math class. You can use matplotlib pyplot’s `hist()` function for it. Let’s see what we get using just the default parameters.

```python
import matplotlib.pyplot as plt

# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
               62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
               76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
               73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
               53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
               68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
               39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
               86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
               52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
               71, 61, 77, 40, 83, 52, 78, 54, 64, 58]

# plot histogram with the default parameters
plt.hist(math_scores)
plt.show()
```

Output:

This histogram somewhat resembles a normal distribution, with a large number of students scoring between 60 and 80 (closer to the mean) and the frequency tapering off at both ends. We can add some basic formatting to the above plot, such as axis labels and a chart title, to make it clearer.

```python
import matplotlib.pyplot as plt

# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
               62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
               76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
               73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
               53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
               68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
               39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
               86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
               52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
               71, 61, 77, 40, 83, 52, 78, 54, 64, 58]

# plot histogram
plt.hist(math_scores)
plt.xlabel("Score")
plt.ylabel("Students")
plt.title("Histogram of scores in the Math class")
plt.show()
```

Output:

You can change the y-axis from raw counts to probability densities using the `density` parameter, which is `False` by default. With `density=True`, the bar heights are scaled so that the total area of the bars equals 1.

```python
import matplotlib.pyplot as plt

# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
               62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
               76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
               73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
               53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
               68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
               39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
               86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
               52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
               71, 61, 77, 40, 83, 52, 78, 54, 64, 58]

# plot histogram with densities instead of counts on the y-axis
plt.hist(math_scores, density=True)
plt.xlabel("Score")
plt.ylabel("Density")
plt.title("Histogram of scores in the Math class")
plt.show()
```

Output:

In the above chart, the height of each bar is the “density” of the values that fall in its bin. That is, for a bin, density = count inside the bin / (total count × bin width).
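To make the density formula concrete, here is a small worked sketch in plain Python. The data and bin edges are made up for illustration, and `bin_density` is a hypothetical helper, not part of Matplotlib; it only mirrors the arithmetic above.

```python
# Toy data and bin edges, chosen only for illustration
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
edges = [0, 2, 4, 10]  # three bins of widths 2, 2 and 6


def bin_density(values, edges):
    """Return count / (total count * bin width) for each bin.

    Hypothetical helper: every bin is [low, high) except the last,
    which is [low, high].
    """
    total = len(values)
    last = len(edges) - 2  # index of the final bin
    densities = []
    for i in range(len(edges) - 1):
        low, high = edges[i], edges[i + 1]
        if i == last:
            count = sum(low <= v <= high for v in values)
        else:
            count = sum(low <= v < high for v in values)
        densities.append(count / (total * (high - low)))
    return densities


densities = bin_density(values, edges)
widths = [high - low for low, high in zip(edges, edges[1:])]

# Each bar's area is density * width; the areas add up to 1
total_area = sum(d * w for d, w in zip(densities, widths))
print(densities)
print(total_area)
```

The bins hold 1, 5, and 4 of the 10 values, so the bar areas are 0.1, 0.5, and 0.4. Because the last bin is three times as wide as the others, its bar is shorter than its count alone would suggest.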

In the above examples, you can see that the `hist()` function uses 10 equal-width bins by default. You can specify your own bin count using the `bins` parameter. For instance, if you want the histogram to have 20 bins:

```python
import matplotlib.pyplot as plt

# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
               62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
               76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
               73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
               53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
               68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
               39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
               86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
               52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
               71, 61, 77, 40, 83, 52, 78, 54, 64, 58]

# plot histogram
plt.hist(math_scores, bins=20)
plt.xlabel("Score")
plt.ylabel("Students")
plt.title("Histogram of scores in the Math class")
plt.show()
```

Output:

You can see that with a higher bin count we get narrower, more granular bins. Also, note that every bin except the last includes its lower bound and excludes its upper bound, `[include, exclude)`. The final bin includes both bounds, `[include, include]`.
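The bin-boundary rule can be checked on a tiny example. The sketch below uses `numpy.histogram`, which `hist()` relies on to bin the data; the values and edges are made up so that every value sits exactly on a bin edge.

```python
import numpy as np

# Toy values that sit exactly on the bin edges
values = [1, 2, 3, 4]
edges = [1, 2, 3, 4]  # bins: [1, 2), [2, 3), [3, 4]

counts, returned_edges = np.histogram(values, bins=edges)

# 1 falls in the first bin and 2 in the second;
# both 3 and 4 fall in the last bin because it includes its upper edge
print(counts)  # [1 1 2]
```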

You can also specify your own bin edges, which can be unequally spaced. For this, instead of passing an integer to the `bins` parameter, pass a sequence of bin edges. For example, if you want to have bins 0 to 20, 20 to 50, 50 to 70, 70 to 90, and 90 to 100:

```python
import matplotlib.pyplot as plt

# scores in the Math class
math_scores = [72, 41, 65, 63, 82, 63, 51, 57, 39, 63,
               62, 68, 52, 76, 62, 73, 72, 73, 71, 62,
               76, 53, 71, 79, 77, 35, 65, 59, 58, 70,
               73, 69, 59, 75, 73, 63, 65, 81, 46, 59,
               53, 71, 79, 80, 60, 60, 64, 40, 73, 75,
               68, 58, 81, 65, 55, 62, 82, 47, 85, 62,
               39, 77, 82, 78, 57, 58, 72, 75, 65, 68,
               86, 49, 39, 64, 54, 68, 85, 77, 62, 53,
               52, 76, 80, 84, 69, 61, 69, 65, 89, 97,
               71, 61, 77, 40, 83, 52, 78, 54, 64, 58]

# specify the bin edges
bin_edges = [0, 20, 50, 70, 90, 100]

# plot histogram
plt.hist(math_scores, bins=bin_edges)
plt.xlabel("Marks in Math")
plt.ylabel("Students")
plt.title("Histogram of scores in the Math class")
plt.show()
```

Output:

Here, the bins are unequally spaced because of the bin edges specified. Matplotlib’s `hist()` function also has a number of other parameters to customize your plots even further. For more, refer to its documentation.
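As a small sketch of a few of those parameters, the example below passes `color`, `edgecolor`, and `alpha` to `hist()` and captures its return values (the bin counts, the bin edges, and the drawn bar patches). The short `scores` list and the styling choices are illustrative assumptions, not the tutorial's dataset.

```python
import matplotlib.pyplot as plt

# a small illustrative sample (not the full Math class data)
scores = [35, 41, 52, 58, 63, 65, 68, 72, 75, 79, 82, 97]
bin_edges = [0, 20, 50, 70, 90, 100]

# hist() returns the bin counts, the bin edges, and the drawn patches
counts, edges, patches = plt.hist(
    scores,
    bins=bin_edges,
    color="steelblue",   # fill color of the bars
    edgecolor="black",   # outline that separates adjacent bars
    alpha=0.8,           # slight transparency
)
plt.xlabel("Score")
plt.ylabel("Students")
plt.title("Histogram with custom styling")

# the counts can be inspected without reading them off the chart
print(counts)
plt.show()
```

Capturing the return values is handy when you need the exact number of observations per bin in addition to the picture.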

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a Python (version 3.8.3) kernel and Matplotlib version 3.2.2.