Python Friday #185: Creating DataFrames in Pandas

While working on my upcoming blog post on filtering data in Pandas, I noticed a little gap in my knowledge: How can we create a DataFrame without the help of a CSV file? Let us find out what options we have.

This post is part of my journey to learn Python. You can find the other parts of this series here. You can find the code for this post in my PythonFriday repository on GitHub.

 

Turn a dictionary into a DataFrame

One straightforward way to create a DataFrame is to use a dictionary. Each key of the dictionary becomes the name of a column, while its value (or list of values) is placed on separate rows below that column:
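A minimal sketch of this approach could look like the following. The column names A and B and the numbers are assumptions chosen for illustration, not necessarily the values from the original listing:

```python
import pandas as pd

# Hypothetical sample data: each key becomes a column name,
# each list holds the values for the rows of that column.
data = {
    "A": [1, 2, 3],
    "B": [4, 5, 6],
}

df = pd.DataFrame(data)
print(df)
print(df.shape)  # (3, 2): three rows, two columns
```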

This gives us a DataFrame that is two columns wide and three rows long:

The values for column A are placed vertically, one below the other.

 

Turn a list of lists into a DataFrame

We can create a list containing other lists and turn that into a DataFrame:
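As a rough sketch, assuming some made-up rows and the hypothetical column names id, name and score, this could look like:

```python
import pandas as pd

# Hypothetical sample data: every inner list becomes one row.
rows = [
    [1, "Anna", 3.5],
    [2, "Ben", 4.0],
]

# The columns argument assigns a name to each position in the inner lists.
df = pd.DataFrame(rows, columns=["id", "name", "score"])
print(df)
```

Without the columns argument, Pandas numbers the columns 0, 1, 2 instead.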

We are free to name the columns as we like, and the items stay in the order in which we put them into the list(s):

The values of the lists keep their order in the mapping of the columns and the rows.

 

Turn a NumPy ndarray into a DataFrame

If we already process our data with NumPy, we can turn an ndarray directly into a DataFrame:
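One possible way to write this, using the column names x, y and z from the description below and otherwise made-up values, is:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a 2D array built from tuples of three values.
values = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

# Each tuple becomes one row; the columns argument names the three positions.
df = pd.DataFrame(values, columns=["x", "y", "z"])
print(df)
```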

This takes the three values we have in each tuple and places them horizontally into our DataFrame:

The tuple (1,2,3) is placed in a way that 1 goes into the column x, 2 into y and 3 into z.

Be aware that this places the values differently than a dictionary does: a dictionary fills a column from top to bottom, while each tuple in the ndarray fills a row from left to right.
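To make the difference visible, here is a small comparison with the same numbers going into both constructors. The variable names from_dict and from_array are only illustrative:

```python
import numpy as np
import pandas as pd

# Dictionary: the list [1, 2, 3] runs down the column x.
from_dict = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

# ndarray: the tuple (1, 2, 3) runs across the first row.
from_array = pd.DataFrame(
    np.array([(1, 2, 3), (4, 5, 6)]),
    columns=["x", "y", "z"],
)

print(from_dict.shape)   # (3, 2)
print(from_array.shape)  # (2, 3)
```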

 

Turn a CSV string into a DataFrame

If we have more data or really like CSV, we can wrap our CSV-formatted text in a StringIO object, which behaves like an in-memory file. We can then pass that object to the read_csv() function without the need to save our values into a file:
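A small sketch of this idea, with invented CSV content (the columns name, age and city are assumptions for illustration):

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV data kept in memory instead of in a file.
csv_text = StringIO(
    "name,age,city\n"
    "Anna,31,Bern\n"
    "Ben,27,Basel\n"
)

# read_csv() accepts file-like objects, so it reads the buffer like a file.
df = pd.read_csv(csv_text)
print(df)
```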

This code allows us to keep doing what we already know and turn CSV into a DataFrame:

The first row in the CSV string is used as a header, the rest of the values stay in the same order as we specified them in the string.

 

Next

These four ways allow us to create a DataFrame in Pandas without the need for an additional CSV file. With this new knowledge, we can experiment next week with the various options Pandas offers to filter data in a DataFrame.
