Python Friday #104: Creating a Pivot Table in Pandas

Last week we looked at how to import data into pandas. With this problem solved, we can now move on to the more interesting task of converting our data into a pivot table.

This post is part of my journey to learn Python. You can find the other parts of this series here. You find the code for this post in my PythonFriday repository on GitHub.

 

Reminder: The sample data

The data I work with looks like this:

 

The pivot() method

On the data frame is the pivot() method that will transform our data as we need it. I want to use the ‘project’ column as the index (top to down), the ‘month’ as columns (left to right) and the ‘loc’ column as its values:

If we run this code, we get this pivot table:

We can now see how the projects increased in lines of code over time.

 

The pivot_table() method

We often not only need a pivot table but some calculations over the rows and columns. For this requirement we can use the pivot_table() method in pandas and create additional columns and rows for grand totals around our data frame (those extra columns are called margin).

The first part of the method call is the same as to pivot(), the parameters after that use the sum function to create a grand total over the pivot table:

This code adds the additional columns we often need in a pivot table:

We can use other aggregate functions like min or max if we want to find the extreme values in every column and row.

 

Conclusion

Pandas allows us to create a pivot table with a single command. It looks like magic but when you use it with your own data you quickly get the expected result, especially when you know how to create pivot tables in Excel. Try it!

2 thoughts on “Python Friday #104: Creating a Pivot Table in Pandas”

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.