Python Friday #167: Often used Diagrams for Matplotlib

Different kinds of data call for different graphical representations. In this post we explore the most commonly used types of diagrams in Matplotlib.

This post is part of my journey to learn Python. You can find the other parts of this series here. The code for this post is in my PythonFriday repository on GitHub.

Line plot / Line chart

The line chart is the diagram we used in the last posts and we can create it with this code:

import matplotlib as mpl
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [0, 6, 1, 2]
fig, ax = plt.subplots()
ax.plot(x, y)

This gives us the familiar looking plot:

A line plot

Bar chart

If we have categorical values, we can use this code to create a bar chart:

month = ['January', 'February', 'March', 'April', 'May', 'June']
temp_average = [3.4, 5.2, 10.3, 14.5, 18.6, 22.5]

fig, ax = plt.subplots()
ax.bar(month, temp_average)

This gives us a bar chart with vertical bars:

A vertically aligned bar chart

If we switch from ax.bar() to ax.barh(), the orientation changes from vertical to horizontal:

month = ['January', 'February', 'March', 'April', 'May', 'June']
temp_average = [3.4, 5.2, 10.3, 14.5, 18.6, 22.5]

fig, ax = plt.subplots()
ax.barh(month, temp_average)

A bar chart with horizontal bars
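
To compare both orientations directly, here is a minimal sketch that puts them side by side in one figure. The figure size, the label rotation and the call to invert_yaxis() are my own additions for readability and not part of the original examples. By default barh() draws the first category at the bottom, so inverting the y-axis moves January to the top:

```python
import matplotlib.pyplot as plt

month = ['January', 'February', 'March', 'April', 'May', 'June']
temp_average = [3.4, 5.2, 10.3, 14.5, 18.6, 22.5]

# One row, two columns: vertical bars on the left, horizontal on the right.
fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(10, 4))

bars_v = ax_v.bar(month, temp_average)
ax_v.tick_params(axis='x', labelrotation=45)  # keep long month names readable

bars_h = ax_h.barh(month, temp_average)
ax_h.invert_yaxis()  # first category (January) on top instead of at the bottom

fig.tight_layout()
```

Both calls return a container with one rectangle per category, which is handy if we want to style individual bars later.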

Pie chart

We can create a pie chart with this code:

labels = 'Cats', 'Dogs'
sizes = [65, 30]

fig, ax = plt.subplots()
ax.pie(sizes, labels=labels)

The result is a circle with proportional coloured segments that correspond to our values:

A pie chart for cats and dogs.
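
Our values add up to 95 and not to 100, but that is no problem because pie() scales the segments by the sum of all values. The following sketch makes this visible by printing the share of each segment into the chart. The autopct format string is my choice for this illustration, and the shares list is only there to show the calculation:

```python
import matplotlib.pyplot as plt

labels = 'Cats', 'Dogs'
sizes = [65, 30]

# The share of each segment is its value divided by the sum of all values.
shares = [100 * size / sum(sizes) for size in sizes]
print([round(share, 1) for share in shares])  # [68.4, 31.6]

fig, ax = plt.subplots()
# autopct writes the percentage into each segment, here with one decimal.
ax.pie(sizes, labels=labels, autopct='%1.1f%%')
```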

Scatter plot

The scatter plot draws a point for each data point we have:

x = [1, 2, 3, 4, 3, 2, 2.5]
y = [0, 6, 1, 2, 5, 4, 2.5]

fig, ax = plt.subplots()
ax.scatter(x, y)

Instead of lines or bars we get dots representing our data:

Each data point is represented with a dot.

The Matplotlib documentation has a more interesting example of what is possible with scatter plots. We can set the size and the colour of each point to produce something like this:

A scatter plot with variable colours and sizes for the data points
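
This is not the example from the documentation, but a small sketch of the same idea with our data points from above. The values in sizes and colours are made up for this illustration, and the colour map, the transparency and the colour bar are my own choices:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 3, 2, 2.5]
y = [0, 6, 1, 2, 5, 4, 2.5]

# Made-up values: one marker area (in points squared) and one number
# per data point. The numbers get mapped to colours through the colour map.
sizes = [20, 60, 100, 140, 180, 220, 260]
colours = [0, 1, 2, 3, 4, 5, 6]

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=sizes, c=colours, cmap='viridis', alpha=0.6)
fig.colorbar(points, ax=ax)  # legend that explains the colour mapping
```

The parameter s controls the size and c the colour. Both accept either a single value for all points or one value per point.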

Histograms

A histogram is a helpful way to see how the data is distributed:

data = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5, 6, 6]

fig, ax = plt.subplots()
ax.hist(data, bins=6)

This shows us that the numbers 3 and 5 are the most frequent in our data set:

The numbers 3 and 5 have the highest bars in the histogram
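
If we want the numbers behind the bars, we can use the values that hist() returns. This sketch assumes nothing beyond the example above. The names counts, edges and patches are my own:

```python
import matplotlib.pyplot as plt

data = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5, 6, 6]

fig, ax = plt.subplots()
# hist() returns the count per bin, the bin edges and the drawn bars.
counts, edges, patches = ax.hist(data, bins=6)

print([int(count) for count in counts])  # [2, 3, 6, 2, 5, 2]
print(len(edges))                        # 7 edges enclose 6 bins
```

With bins=6 Matplotlib splits the range from 1 to 6 into six equally wide bins. Each bin contains exactly one of our whole numbers, so the counts match how often each number occurs.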

Box plot

The box plot is a helpful way to see the distribution of data:

data = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6]

fig, ax = plt.subplots()
ax.boxplot(data)

This gives us a plot that summarises a lot of values while still showing the variability of our data:

The box plot shows the range in which the values are

What does the plot tell us? The whiskers (the lines on top and at the bottom) represent the maximal and minimal values, while the line in the middle of the box is the median. If the data contains outliers, Matplotlib draws them as separate points beyond the whiskers, but our example has none. The distance from the median to the edge of the box and from the box to the whiskers each cover a quarter of the values:

The box plot shows us a quarter of the data between the box and the whiskers and inside the box from the edge of the box to the median.
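
To check the numbers behind the box, we can calculate the quartiles ourselves. This sketch uses NumPy (which gets installed together with Matplotlib) and is only meant as a cross-check. The variable names are my own:

```python
import numpy as np
import matplotlib.pyplot as plt

data = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6]

# Lower edge of the box, median and upper edge of the box.
q1, median, q3 = np.percentile(data, [25, 50, 75])
print(q1, median, q3)        # 1.5 3.0 4.0
print(min(data), max(data))  # 1 6 (the ends of the whiskers in this example)

fig, ax = plt.subplots()
# boxplot() returns a dictionary with the lines it has drawn.
result = ax.boxplot(data)
print(result['medians'][0].get_ydata())  # y position of the median line
```

The box reaches from 1.5 to 4 with the median at 3. The median sits closer to the upper edge of the box, which tells us that the values in the lower half are packed more tightly than those in the upper half.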

In our example we can see that we have more small values than large ones. For more details on box plots and how to interpret them, you should read this article.

If we want a horizontal box plot, we need to set the parameter vert to False (vert = vertical):

data = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6]

fig, ax = plt.subplots()
ax.boxplot(data, vert=False)

The box plot is now horizontally aligned

With these different types of diagrams, we can illustrate most of the data we work with. If that is not enough, there are plenty more visualisations to choose from in the documentation. Next week we explore ways to customise how our plots get drawn.
