When you want to use Python for little maintenance tasks on your computer, you will need to work with files. Python offers you a range of built-in functions to do that and this post covers the most common ones.
However, this post does not cover working with binary files. You most likely will need a specific library to work with them and the documentation of that library can explain the details much better than I can.
This post is part of my journey to learn Python. You can find the other parts of this series here.
Reading and writing text files
In Python you can use the built-in function open() and the file method write() to write text to a file:
    with open("hello.txt", "w") as f:
        f.write("Hello world!\n")
        f.write("a second line\n")
You can use the same open() function and read() to read the content of a file:
    with open("hello.txt", "r") as x:
        content = x.read()
        print(content)
Hello world!
a second line
Even though each action takes only three lines, there is a lot going on. The with statement makes sure the file is closed again, even when an exception occurs, while the open function does the actual work of opening the file.
The arguments “r” and “w” in the call to open are modes that tell open what we want to do with the file. If you type help(open) in the Python interpreter, you can find the list of all supported modes and their meaning:
    ========= ===============================================================
    Character Meaning
    --------- ---------------------------------------------------------------
    'r'       open for reading (default)
    'w'       open for writing, truncating the file first
    'x'       create a new file and open it for writing
    'a'       open for writing, appending to the end of the file if it exists
    'b'       binary mode
    't'       text mode (default)
    '+'       open a disk file for updating (reading and writing)
    'U'       universal newline mode (deprecated)
    ========= ===============================================================
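To make the difference between the modes "w", "a" and "x" more tangible, here is a minimal sketch. The temporary directory and the file name modes.txt are assumptions I picked for the illustration, so the example does not touch any of your real files:

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "modes.txt")  # hypothetical file name

# "w" creates the file (or truncates an existing one).
with open(path, "w") as f:
    f.write("first\n")

# "a" keeps the existing content and appends to the end.
with open(path, "a") as f:
    f.write("second\n")

# "x" refuses to overwrite: it raises FileExistsError if the file exists.
try:
    with open(path, "x") as f:
        f.write("never written\n")
    created = True
except FileExistsError:
    created = False

with open(path, "r") as f:
    content = f.read()

print(content)
print("created with 'x':", created)
```

The file should end up with the two lines written in "w" and "a" mode, while the attempt with "x" is rejected because the file already exists.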
Error handling
Files that we open should be closed in all cases, otherwise we may lose data. If you use the with statement as I did in the examples above, you do not need to care about closing your file when an exception occurs. The with statement does all that for you. This little trick was introduced around 2005 in PEP 343 and I suggest you use it whenever you work with files.
If you look at other explanations on how to work with files, you may find code examples that look like this:
    f = open("hello.txt", "w")
    f.write("Hello world!")
    f.close()
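Here you have no exception handling and your file stays open if something unexpected happens. To prevent that, you need to open the file first and wrap everything you do with it in a try/finally block. The call to open() stays outside the try block, because if it fails there is no file object that could be closed:

    f = open("hello.txt", "w")
    try:
        f.write("Hello world!")
    finally:
        # runs even when write() raises an exception
        f.close()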
This robust version is a lot longer than the one using the with statement. You therefore save yourself a lot of typing if you use the with statement, and you get the same protection.
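If you want to convince yourself that the with statement closes the file even when something goes wrong, you can try a small experiment like the following sketch. The RuntimeError is raised on purpose and the temporary directory is an assumption for the illustration:

```python
import os
import tempfile

# Hypothetical location in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")

try:
    with open(path, "w") as f:
        f.write("Hello world!\n")
        # Simulate an unexpected problem inside the with block.
        raise RuntimeError("something unexpected happened")
except RuntimeError:
    pass

# The name f is still bound here, but the file behind it is closed.
closed_after_error = f.closed
print("closed after error:", closed_after_error)

# Closing also flushed the buffer, so the text reached the disk.
with open(path, "r") as f:
    content = f.read()
print(content)
```

The attribute closed of the file object should be True after the exception, and the line written before the error should still be in the file.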
Reading a file line by line
We can read a file line by line with the readline() method. To get all lines, we call this method until it returns an empty string:
    with open("hello.txt", "r") as x:
        while True:
            line = x.readline()
            # stop when end of file is reached
            if not line:
                break
            # line still ends with "\n", so do not let print() add another one
            print(line, end="")
If we just want the content of the file as a list of lines, the method readlines() is a lot simpler:
    with open("hello.txt", "r") as x:
        lines = x.readlines()
        for line in lines:
            # each line keeps its trailing "\n", so print() must not add one
            print(line, end="")
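As an alternative to readline() and readlines(), a file object can also be used directly in a for loop, which reads one line at a time without loading the whole file into a list first. Here is a minimal sketch of that approach. It writes its own hello.txt into a temporary directory (an assumption for the illustration) so that it runs on its own:

```python
import os
import tempfile

# Hypothetical location in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")

with open(path, "w") as f:
    f.write("Hello world!\n")
    f.write("a second line\n")

lines = []
with open(path, "r") as f:
    # Iterating over the file object yields one line per loop pass.
    for line in f:
        # Remove the trailing newline before storing the line.
        lines.append(line.rstrip("\n"))

print(lines)
```

This variant is handy for large files, because only the current line has to be kept in memory.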
Fixing encoding problems
If we try to read a file that is not stored in the encoding Python expects (by default the preferred encoding of our operating system, which is UTF-8 on most Linux and macOS systems), our attempt to read the file may end with an exception:
    >>> with open("umlaute.txt", "r") as r:
    ...     content = r.read()
    ...     print(content)
    ...
    Traceback (most recent call last):
      File "", line 1, in
      File "", line 3, in a
      File "/usr/lib/python3.6/codecs.py", line 321, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 46: invalid continuation byte
Luckily for us, we can specify the encoding when we call open(). As soon as we set the correct encoding, Python can read our file:
    >>> with open("umlaute.txt", "r", encoding="iso-8859-15") as r:
    ...     content = r.read()
    ...     print(content)
    ...
    Als Umlaut bezeichnet man auch die Buchstaben Ä/ä, Ö/ö, Ü/ü.
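If you do not have such a file at hand, you can reproduce the problem yourself. The following sketch writes a short text with umlauts in ISO-8859-15, then reads it back once with the wrong and once with the right encoding. The file location and the sample text are assumptions for the illustration:

```python
import os
import tempfile

# Hypothetical location in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "umlaute.txt")
text = "Ä/ä, Ö/ö, Ü/ü\n"

# Write the umlauts as single bytes in ISO-8859-15.
with open(path, "w", encoding="iso-8859-15") as f:
    f.write(text)

# Reading the same bytes as UTF-8 fails, because they are not valid UTF-8.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# With the matching encoding, the text comes back unchanged.
with open(path, "r", encoding="iso-8859-15") as f:
    content = f.read()

print("UTF-8 failed:", utf8_failed)
print(content)
```

Passing the encoding explicitly when writing and reading also makes your script behave the same on every operating system, because it no longer depends on the platform default.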
Conclusion
Working with files in Python is not that complicated. When you use the with statement you prevent the most common errors and save a lot of typing.