Python Friday #103: Working With CSV Files in Pandas

Until now I use Excel to make a Pivot out of my data stored in CSV files. While this works, I need a lot of manual steps that are cumbersome and time-consuming. Pandas offers an automatable way to do this task, but before I can create the pivot, I need to import my data. Let us look how we can do this part with pandas.

This post is part of my journey to learn Python. You can find the other parts of this series here. You find the code for this post in my PythonFriday repository on GitHub.

Pandas?

Pandas is a data analysis library for Python. It offers an enormous functionality and I only use a tiny bit for my tasks. If you want to know more about pandas, you should check the official documentation.

You can install pandas with this command:

pip install pandas

1	pip install pandas

My sample data

The data I want to import looks like this:

month;project;loc
2021.01;A;100
2021.01;B;1000
2021.01;C;1100
2021.02;A;200
2021.02;B;2100
2021.02;C;1100
2021.03;A;500
2021.03;B;2100
2021.03;C;3100
2021.04;A;1000
2021.04;B;5000
2021.04;C;4100

month;project;loc

2021.01;A;100

2021.01;B;1000

2021.01;C;1100

2021.02;A;200

2021.02;B;2100

2021.02;C;1100

2021.03;A;500

2021.03;B;2100

2021.03;C;3100

2021.04;A;1000

2021.04;B;5000

2021.04;C;4100

Reading CSV files

We can use the read_csv() method from pandas to create a data frame (df). The data frame is one of the basic objects in pandas on which you do all the work. If you use something else than a “,” (comma) to separate your data, you must set the delimiter argument – otherwise pandas will put all your data into one single column:

import pandas as pd

df = pd.read_csv('input.csv', delimiter=';')
print(df)

import pandas as pd

df = pd.read_csv('input.csv', delimiter=';')

print(df)

If you run the code from above, you should get this output:

      month project   loc
0   2021.01       A   100
1   2021.01       B  1000
2   2021.01       C  1100
3   2021.02       A   200
4   2021.02       B  2100
5   2021.02       C  1100
6   2021.03       A   500
7   2021.03       B  2100
8   2021.03       C  3100
9   2021.04       A  1000
10  2021.04       B  5000
11  2021.04       C  4100

month project loc

0 2021.01 A 100

1 2021.01 B 1000

2 2021.01 C 1100

3 2021.02 A 200

4 2021.02 B 2100

5 2021.02 C 1100

6 2021.03 A 500

7 2021.03 B 2100

8 2021.03 C 3100

9 2021.04 A 1000

10 2021.04 B 5000

11 2021.04 C 4100

Writing CSV files

Writing your data frame to a CSV file is similar to reading them. We must specify a file name, while everything else (like the delimiter) in the call to to_csv() is optional. For most exports you want to specify index=False to prevent the index column (that every data frame has) in your output file:

df.to_csv('output_converted.csv', index=False)

1	df.to_csv('output_converted.csv', index=False)

If we run this code our output file uses “,” (comma) as a delimiter:

month,project,loc
2021.01,A,100
2021.01,B,1000
2021.01,C,1100
2021.02,A,200
2021.02,B,2100
2021.02,C,1100
2021.03,A,500
2021.03,B,2100
2021.03,C,3100
2021.04,A,1000
2021.04,B,5000
2021.04,C,4100

month,project,loc

2021.01,A,100

2021.01,B,1000

2021.01,C,1100

2021.02,A,200

2021.02,B,2100

2021.02,C,1100

2021.03,A,500

2021.03,B,2100

2021.03,C,3100

2021.04,A,1000

2021.04,B,5000

2021.04,C,4100

We now know how to get the data into pandas and back into a CSV file. Next week we turn our data into a pivot table.

Python Friday #103: Working With CSV Files in Pandas

Pandas?

My sample data

Reading CSV files

Writing CSV files

Next

Like this:

Related

1 thought on “Python Friday #103: Working With CSV Files in Pandas”

Leave a Comment Cancel reply

Pandas?

My sample data

Reading CSV files

Writing CSV files

Next

Share this:

Like this:

Related

1 thought on “Python Friday #103: Working With CSV Files in Pandas”

Leave a Comment Cancel reply