Python Friday #103: Working With CSV Files in Pandas

Until now I use Excel to make a Pivot out of my data stored in CSV files. While this works, I need a lot of manual steps that are cumbersome and time-consuming. Pandas offers an automatable way to do this task, but before I can create the pivot, I need to import my data. Let us look how we can do this part with pandas.

This post is part of my journey to learn Python. You can find the other parts of this series here. You find the code for this post in my PythonFriday repository on GitHub.

 

Pandas?

Pandas is a data analysis library for Python. It offers an enormous functionality and I only use a tiny bit for my tasks. If you want to know more about pandas, you should check the official documentation.

You can install pandas with this command:

 

My sample data

The data I want to import looks like this:

 

Reading CSV files

We can use the read_csv() method from pandas to create a data frame (df). The data frame is one of the basic objects in pandas on which you do all the work. If you use something else than a “,” (comma) to separate your data, you must set the delimiter argument – otherwise pandas will put all your data into one single column:

If you run the code from above, you should get this output:

 

Writing CSV files

Writing your data frame to a CSV file is similar to reading them. We must specify a file name, while everything else (like the delimiter) in the call to to_csv() is optional. For most exports you want to specify index=False to prevent the index column (that every data frame has) in your output file:

If we run this code our output file uses “,” (comma) as a delimiter:

 

Next

We now know how to get the data into pandas and back into a CSV file. Next week we turn our data into a pivot table.

1 thought on “Python Friday #103: Working With CSV Files in Pandas”

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.