Sometimes collecting data takes a lot longer than processing it. Wouldn’t it be great if we could store whole Python objects? Then, if something goes wrong, we can restart the analytics part without having to recreate the data.
This post is part of my journey to learn Python. You can find the other parts of this series here. You can find the code for this post in my PythonFriday repository on GitHub.
What is pickle?
The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. (Source: docs.python.org)
Pickle is a great helper for temporarily storing your data. Be aware that unpickling can execute arbitrary code, so a manipulated pickle file can compromise your computer. Therefore, only unpickle data from sources you trust!
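To make that warning more concrete, here is a minimal and harmless sketch. The class name Sneaky is made up for this illustration. It uses the __reduce__ hook to tell pickle which callable to run when the data is loaded. Here that callable is only print(), but an attacker could pick something far less friendly:

```python
import pickle

class Sneaky:
    # __reduce__ tells pickle how to rebuild an object:
    # a callable plus the arguments to call it with.
    # Nothing forces that callable to be a harmless constructor.
    def __reduce__(self):
        return (print, ("This ran while unpickling!",))

# dumps() creates the pickle as bytes in memory
blob = pickle.dumps(Sneaky())

# loads() calls print(...) instead of recreating a Sneaky instance
result = pickle.loads(blob)
print(result)  # None, the return value of print()
```

The message shows up as soon as loads() runs, without any other code touching the result. That is why a pickle file from an unknown source should never be loaded.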
Save your objects
We can use the same example data from the post on PrettyPrinter:
    def example():
        # You collect and combine data in an
        # arbitrarily nested data structure:
        data = [
            {
                'name': "Rebecca Stephenson",
                'phone': "(154) 221-8558",
                'zipCode': "900185",
                'country': "South Korea",
                'options': ['a','b','c'],
                'total': "$74.79"
            },
            {
                'name': "Amos Nieves",
                'phone': "1-762-301-2264",
                'zipCode': "25566",
                'country': "Russian Federation",
                'options': {
                    'a': 'full',
                    'f': 'partial',
                    'c': {'k1': 1, 'k2': 3}
                },
                'total': "$21.78"
            }
        ]

        return data
For pickle we need a file that we open in binary mode. The dump() function does all the work and persists our objects:
    import pickle

    def save():
        data = example()

        # Use binary mode for your file
        with open("pickle.bin", "wb") as f:
            pickle.dump(data, f)
The pickle.bin file now contains our objects in a binary format. If we open the file in a text editor, we see a lot of special characters.
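If you want to look at those bytes without leaving Python, a small sketch like the following can help. It assumes a shortened sample dictionary instead of the full example data and uses dumps(), which returns the bytes instead of writing them to a file. The pickletools module from the standard library can print a readable listing of what is inside:

```python
import pickle
import pickletools

# Shortened sample data, only for this illustration
sample = {'name': "Rebecca Stephenson", 'options': ['a', 'b', 'c']}

# dumps() returns the pickled bytes instead of writing them to a file
blob = pickle.dumps(sample)

print(type(blob))   # <class 'bytes'>
print(blob[:2])     # PROTO opcode (0x80) followed by the protocol number
print(blob[-1:])    # every pickle ends with the STOP opcode, b'.'

# dis() prints a human-readable listing of the opcodes in the pickle
pickletools.dis(blob)
```

The exact bytes depend on the pickle protocol of your Python version, so the listing may differ slightly between installations.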
Restore your objects
Pickle offers the load() function to restore our objects from a file (that we need to open in binary mode):
    def restore():
        # Use binary mode to read your file
        with open("pickle.bin", "rb") as f:
            data = pickle.load(f)

        for entry in data:
            for key in entry:
                print(f"{key} => {entry[key]}")
For our application, the restored objects look the same as if we had created them in code. We can run our script and print the values to the command line or do whatever we want with them:
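A quick way to convince yourself of that is a round trip in memory. This is only a sketch with a shortened sample list, and it uses dumps() and loads() so that no file is needed:

```python
import pickle

# Shortened sample data, only for this illustration
original = [
    {'name': "Amos Nieves", 'options': {'a': 'full', 'c': {'k1': 1, 'k2': 3}}},
]

# Round trip through bytes in memory; a file on disk behaves the same way
restored = pickle.loads(pickle.dumps(original))

print(restored == original)   # True: same values, same nesting
print(restored is original)   # False: unpickling builds new objects
```

The restored data is an independent copy, so changing it does not touch the original objects.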
    if __name__ == '__main__':
        save()
        restore()
    python .\pickle_example.py
name => Rebecca Stephenson
phone => (154) 221-8558
zipCode => 900185
country => South Korea
options => ['a', 'b', 'c']
total => $74.79
name => Amos Nieves
phone => 1-762-301-2264
zipCode => 25566
country => Russian Federation
options => {'a': 'full', 'f': 'partial', 'c': {'k1': 1, 'k2': 3}}
total => $21.78
Conclusion
The pickle module in Python offers a built-in way to persist whole object graphs without much effort. If you need to store your objects temporarily, you should give pickle a try.
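To come back to the slow data collection from the beginning of this post, one possible pattern is a small helper that loads the pickle file if it exists and only collects the data when it does not. The names load_or_collect, slow_collection and cache.bin are made up for this sketch, and the temporary folder is only there to keep the example self-contained:

```python
import os
import pickle
import tempfile

def load_or_collect(path, collect):
    """Return the pickled data at path, or call collect() and save its result."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)

    data = collect()
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return data

runs = []

def slow_collection():
    # Stand-in for the expensive data collection step
    runs.append(1)
    return [{'name': "Rebecca Stephenson", 'total': "$74.79"}]

with tempfile.TemporaryDirectory() as folder:
    cache = os.path.join(folder, "cache.bin")
    first = load_or_collect(cache, slow_collection)   # collects and saves
    second = load_or_collect(cache, slow_collection)  # reads the file instead

print(len(runs))        # 1
print(first == second)  # True
```

The collection step runs only once. The second call gets its data from the file, which is exactly what we want when the analytics part needs a restart.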