The more complex the data, the more helpful a good visual representation is. Python offers us such a wide range of tools to visualise our data that it is difficult to make a choice. Therefore, we will look at a few different approaches to data visualisation in the coming months to use the right tool for our use cases.
This post is part of my journey to learn Python. You can find the other parts of this series here. You can find the code for this post in my PythonFriday repository on GitHub.
Matplotlib
Matplotlib was first released in 2003 and is the basis of many newer visualisation tools in Python. That makes it a great starting point for the topic of data visualisation.
Before we can start exploring this tool, we need to install it with pip:
    pip install matplotlib
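To confirm that the installation worked, one quick check is to import the package and print its version. This is only a small sketch; the version number you see depends on what pip installed.

```python
# Sanity check: if this import fails, Matplotlib is not installed
# in the Python environment you are currently using.
import matplotlib

print(matplotlib.__version__)
```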
The documentation is a must-read
While it is generally a good idea to read the documentation of a library, it is a must with Matplotlib. There are so many options, methods, and ways to format your diagrams that it is best to have the documentation within reach whenever you work with it.
In the examples section of the documentation, you can see the wide range of visualisations that Matplotlib supports and that make it such a powerful but complex tool.
Use the cheat sheets and handouts
The official cheat sheet and the handouts for Matplotlib are a great addition to the documentation. You can find everything on matplotlib.org/cheatsheets, and they help you to quickly find the most important customisations. Make sure that you download the PDF version; it offers a much better experience than trying to decipher the PNG.
A first plot
When you work with data visualisation, you often see the term “plot”. The definition by Wikipedia is as follows:
A plot is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables.
We can use the basic example of the Matplotlib documentation for our first plot:
    import matplotlib as mpl
    import matplotlib.pyplot as plt

    x = [1, 2, 3, 4]
    y = [0, 6, 1, 2]

    fig, ax = plt.subplots()  # Create a figure containing a single axes.
    ax.plot(x, y)  # Plot some data on the axes.
When we run this code in JupyterLab, we get a line plot that connects the four points.
What is going on? We import matplotlib and matplotlib.pyplot to get the functionality of Matplotlib in our notebook. Then we define the x and y coordinates of the four points we want to draw: (1, 0), (2, 6), (3, 1) and (4, 2).
The call to plt.subplots() gives us a Figure and an Axes object back. The Axes object is the subplot on which we draw the x and y coordinates of the points.
Since we are in JupyterLab, we need no extra commands to see the graphical representation of the matplotlib.lines.Line2D object that ax.plot() generates.
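If you want to see these objects for yourself, a small sketch like the following can help. It repeats the first plot and checks the types of what plt.subplots() and ax.plot() return. The variable name lines is my own choice for illustration.

```python
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.figure import Figure
from matplotlib.lines import Line2D

x = [1, 2, 3, 4]
y = [0, 6, 1, 2]

fig, ax = plt.subplots()
lines = ax.plot(x, y)  # ax.plot() returns a list of Line2D objects

print(isinstance(fig, Figure))       # the whole drawing area
print(isinstance(ax, Axes))          # the subplot we draw on
print(isinstance(lines[0], Line2D))  # the line that connects our points
print(len(lines))                    # one line for one x/y pair
```

All three isinstance checks should print True, and the list contains a single line because we passed a single pair of x and y values.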
Show plots in the command line
If you run Matplotlib from the command line, you need to call plt.show() to display the plot:
    plt.show()
This opens the plot in a new window where you can zoom in or save it.
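Put together, a stand-alone script for the first plot could look like the sketch below. The function name make_first_plot and the idea of wrapping the code in a function are my assumptions for illustration and not part of the Matplotlib example; the Matplotlib calls themselves are the same as above.

```python
import matplotlib.pyplot as plt


def make_first_plot():
    """Build the first plot and return its Figure and Axes (hypothetical helper)."""
    x = [1, 2, 3, 4]
    y = [0, 6, 1, 2]

    fig, ax = plt.subplots()  # Create a figure containing a single axes.
    ax.plot(x, y)             # Plot some data on the axes.
    return fig, ax


if __name__ == "__main__":
    make_first_plot()
    # Outside of JupyterLab nothing is displayed until we call show().
    # With an interactive backend this blocks until the window is closed.
    plt.show()
```

You could save this in a file and run it with python from your terminal. On a machine without a graphical environment Matplotlib cannot open a window, so plt.show() will not display anything there.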
Next
This was a quick run through the installation of Matplotlib and drawing the first graph. Over the next weeks, we will go deeper into the different parts that make Matplotlib so exciting. Next week we start with a closer look at figures and axes and how we can use them properly.