Python Friday #175: Visualise Data in Pandas With Plot()

We are back on our journey to data visualisation. Matplotlib offers us a lot of features, but combining multiple plots into one graphic is painful. Pandas gives us an abstraction over Matplotlib that works on the whole data frame. Let us explore the plotting capabilities we get in Pandas.

This post is part of my journey to learn Python. You can find the other parts of this series here. You can find the code for this post in my PythonFriday repository on GitHub.

 

Preparations

To show the power of Pandas for plotting, we need a data frame with data. We can use this list of mean daily maximum temperatures (in Celsius) for the cities Bern, Oslo, and Rome:
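A minimal sketch of such a data frame could look like this. The temperature values are rounded placeholders chosen for illustration, not measured data, and the variable names are assumptions:

```python
import pandas as pd

# Month names used as the index (rows) of the data frame.
months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Illustrative placeholder temperatures in °C, one list per city.
# Assumption: these are NOT official or measured values.
temperatures = {
    "Bern": [3, 5, 10, 14, 18, 22, 24, 24, 19, 14, 7, 4],
    "Oslo": [0, 1, 5, 11, 17, 21, 23, 21, 16, 10, 4, 1],
    "Rome": [12, 13, 16, 19, 24, 28, 31, 31, 27, 22, 17, 13],
}

# Each dictionary key becomes a column, the months become the index.
df = pd.DataFrame(temperatures, index=months)
print(df)
```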

This gives us a data frame that looks like this one:

The temperatures for Bern, Oslo, and Rome are next to each other as columns, while the months from January to December form the index (top to bottom).

 

Line plot / Line chart

If we call the plot() method on the data frame without any options, it will create a line plot:
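As a sketch, assuming a data frame with one column per city and placeholder temperatures (illustrative values, not measured data), the call is a single line:

```python
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Placeholder temperatures in °C (assumption, for illustration only).
df = pd.DataFrame({
    "Bern": [3, 5, 10, 14, 18, 22, 24, 24, 19, 14, 7, 4],
    "Oslo": [0, 1, 5, 11, 17, 21, 23, 21, 16, 10, 4, 1],
    "Rome": [12, 13, 16, 19, 24, 28, 31, 31, 27, 22, 17, 13],
}, index=months)

# plot() without arguments draws one line per column and returns
# the Matplotlib Axes object that holds the plot.
ax = df.plot()
```

In a script you would call matplotlib.pyplot.show() afterwards to open the window; in a notebook the figure usually appears on its own.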

Without any additional work we get all three cities in the same plot:

The temperatures for the three cities are plotted in the same graphic with a legend that shows which colour belongs to which city.

If we only want to plot specific columns, we can create a list with their names and then filter the data frame:
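A minimal sketch of this filter, again with placeholder temperatures (illustrative values, not measured data) and an assumed variable name cities:

```python
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Placeholder temperatures in °C (assumption, for illustration only).
df = pd.DataFrame({
    "Bern": [3, 5, 10, 14, 18, 22, 24, 24, 19, 14, 7, 4],
    "Oslo": [0, 1, 5, 11, 17, 21, 23, 21, 16, 10, 4, 1],
    "Rome": [12, 13, 16, 19, 24, 28, 31, 31, 27, 22, 17, 13],
}, index=months)

# Indexing with a list of column names returns a new data frame
# that contains only those columns; we plot that smaller frame.
cities = ["Bern", "Rome"]
ax = df[cities].plot()
```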

We now only get the cities Bern and Rome in our plot:

Only Bern and Rome are part of the plot.

 

Bar chart

To get a bar chart of our data frame, we can use the method df.plot(kind="bar") or the dedicated method df.plot.bar(); both produce the same result:
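Both variants can be sketched side by side. The temperatures are placeholders for illustration, and the names ax_kind and ax_method are assumptions:

```python
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Placeholder temperatures in °C (assumption, for illustration only).
df = pd.DataFrame({
    "Bern": [3, 5, 10, 14, 18, 22, 24, 24, 19, 14, 7, 4],
    "Oslo": [0, 1, 5, 11, 17, 21, 23, 21, 16, 10, 4, 1],
    "Rome": [12, 13, 16, 19, 24, 28, 31, 31, 27, 22, 17, 13],
}, index=months)

# Variant 1: the generic plot() method with the kind parameter.
ax_kind = df.plot(kind="bar")

# Variant 2: the dedicated method on the plot accessor.
# Each call on a data frame creates its own figure.
ax_method = df.plot.bar()
```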

This gives us a bar chart with the temperatures for each month for all three cities:

We have a bar chart with vertical bars, grouped by month from January to December, with a different colour for each city.

If we want to get a horizontal bar chart, we can use either df.plot(kind="barh") or df.plot.barh():
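A short sketch of the horizontal variant, using placeholder temperatures (illustrative values, not measured data):

```python
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Placeholder temperatures in °C (assumption, for illustration only).
df = pd.DataFrame({
    "Bern": [3, 5, 10, 14, 18, 22, 24, 24, 19, 14, 7, 4],
    "Oslo": [0, 1, 5, 11, 17, 21, 23, 21, 16, 10, 4, 1],
    "Rome": [12, 13, 16, 19, 24, 28, 31, 31, 27, 22, 17, 13],
}, index=months)

# barh() draws the bars horizontally; df.plot(kind="barh") is equivalent.
# The first row of the index ends up at the bottom of the y-axis.
ax = df.plot.barh()
```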

This turns the bar chart on the side:

The bars now go from left to right, but start at the top with December and go down to January.

There is only one annoying detail with this chart: the months start with December at the top. If you want a list from January to December instead of December to January, you can invert the y-axis:
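One way to do this, assuming we work with the Axes object that the plot call returns, is Matplotlib's invert_yaxis() method. The temperatures are placeholders for illustration:

```python
import pandas as pd

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Placeholder temperatures in °C (assumption, for illustration only).
df = pd.DataFrame({
    "Bern": [3, 5, 10, 14, 18, 22, 24, 24, 19, 14, 7, 4],
    "Oslo": [0, 1, 5, 11, 17, 21, 23, 21, 16, 10, 4, 1],
    "Rome": [12, 13, 16, 19, 24, 28, 31, 31, 27, 22, 17, 13],
}, index=months)

ax = df.plot.barh()

# Flip the direction of the y-axis so that the first index entry
# (January) is drawn at the top instead of at the bottom.
ax.invert_yaxis()
```

An alternative with a similar effect would be to reverse the rows before plotting, for example with df.iloc[::-1].plot.barh(); inverting the axis leaves the data frame untouched.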

This fixes the order of the months to something more familiar:

The bars now start with January.

 

Pie chart

For the pie chart we need a data frame with the categories we want in our plot:
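A minimal sketch of such a data frame, with the categories as the index. The column name votes is an assumption made for this example:

```python
import pandas as pd

# The categories go into the index, their values into a single column.
# Assumption: the column name "votes" is made up for this sketch.
df = pd.DataFrame({"votes": [65, 30]}, index=["Cats", "Dogs"])
print(df)
```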

This gives us a data frame with the two categories "Cats" and "Dogs":

We get a value of 65 for cats and 30 for dogs.

We can now filter the data frame down to a single column and plot the pie chart with plot(kind="pie"), or we can use the df.plot.pie() method and select the column through its y parameter:
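Both approaches could look like the following sketch. The column name votes and the variable names are assumptions:

```python
import pandas as pd

# Assumption: the column name "votes" is made up for this sketch.
df = pd.DataFrame({"votes": [65, 30]}, index=["Cats", "Dogs"])

# Approach 1: select the single column first (this gives a Series)
# and plot it with kind="pie".
ax_series = df["votes"].plot(kind="pie")

# Approach 2: keep the data frame and tell pie() which column
# supplies the values through the y parameter.
ax_frame = df.plot.pie(y="votes")
```

The index labels ("Cats" and "Dogs") are used as the labels of the wedges in both cases.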

Both approaches give us the same pie chart:

A pie chart with a larger value for cats than dogs.

 

Scatter plot

For the scatter plot we need a data frame with (at least) two columns:
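A minimal sketch with two numeric columns a and b and seven rows. The values are made up for illustration:

```python
import pandas as pd

# Seven illustrative points (assumption: the values are invented
# for this sketch); every row is one dot in the scatter plot.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6, 7],
    "b": [2, 4, 3, 6, 5, 8, 7],
})
print(df)
```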

This gives us a data frame like this:

A data frame with a column a and b.

We need to tell the df.plot.scatter() method which column holds the X values and which column holds the Y values for the dots:
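As a sketch, assuming a data frame with the columns a and b and made-up values:

```python
import pandas as pd

# Seven illustrative points (assumption: invented values).
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6, 7],
    "b": [2, 4, 3, 6, 5, 8, 7],
})

# x and y take the names of the columns that supply the
# horizontal and vertical position of each dot.
ax = df.plot.scatter(x="a", y="b")
```

The column names also end up as the labels of the two axes.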

This gives us a scatter plot with the 7 points of our data frame:

The 7 dots are placed in the scatter plot according to the two columns of our data frame.

 

Next

We can use the plot() method and specify the kind of plot we want or use a dedicated method to visualise our data frame. Next week we look at the slightly different options we have for histograms and box plots.
