When we work with larger datasets in Jupyter, we will notice a slowdown in execution time. Let us look at 4 magic commands that Jupyter offers to check the performance of our statements.
This post is part of my journey to learn Python. You can find the other parts of this series here. You find the code for this post in my PythonFriday repository on GitHub.
%time
With the magic command %time we can get the time it takes to run a statement:
import time

def slow():
    time.sleep(5)

%time slow()
The numbers vary between runs, but it should take at least 5 seconds plus a bit of overhead for the method call:
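If you want the same kind of single-run measurement outside a notebook, you can approximate what %time reports with the standard library's time.perf_counter(). A minimal sketch:

```python
import time

def slow():
    time.sleep(5)

# Measure one run, similar to the "Wall time" that %time reports
start = time.perf_counter()
slow()
elapsed = time.perf_counter() - start
print(f"Wall time: {elapsed:.2f} s")
```

As with %time, this is a single measurement and includes whatever else your machine was doing at that moment.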
%timeit
A single measurement of the execution time is of little use for performance comparisons. We need multiple runs to figure out the real values without interference from other activities on your computer. Therefore, %timeit runs your statement multiple times:
%timeit slow()
This time we must wait longer to get a result, because %timeit runs our code 7 times instead of the one run for %time:
For much faster statements, %timeit increases the number of loops per run, up to 1 million, to get a useful measurement. Be aware of that, especially when your statement calls external resources.
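Outside Jupyter, the standard library's timeit module (which %timeit builds on) gives you the same repeated measurements. A small sketch with an arbitrary fast statement, using 5 repeats of 1000 loops each instead of the defaults:

```python
import timeit

# timeit.repeat mirrors %timeit: several repeats, many loops per repeat
results = timeit.repeat("x = sum(range(100))", repeat=5, number=1000)

# Take the best repeat and average it over the loops, as %timeit does
best = min(results) / 1000
print(f"Best of 5: {best * 1e6:.2f} µs per loop")
```

Taking the minimum of the repeats reduces the influence of other processes that happened to run during a measurement.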
%memit
Before we can figure out how much memory our statement needs, we must install a memory profiler:
pip install memory_profiler
After restarting JupyterLab we can load the memory profiler and then run %memit to collect the metrics for memory usage of our code:
%load_ext memory_profiler

%memit slow()
When I ran the code in Jupyter, it used 2.54 MiB of additional memory and the whole Jupyter process peaked at 133.31 MiB:
Keep in mind that this is just one run and that the memory consumption may change with the next one. Nevertheless, it is a good indicator of what goes on memory-wise with your statement. For additional details you can check the IPython section of the memory profiler documentation.
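If you cannot install memory_profiler, the standard library's tracemalloc offers a related view. Note the difference: %memit reports the memory of the whole process, while tracemalloc only tracks allocations made by Python code. A minimal sketch with an arbitrary allocation:

```python
import tracemalloc

tracemalloc.start()

# Allocate some memory so there is something to measure
data = [i ** 2 for i in range(100_000)]

# current: memory still held; peak: highest point since start()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"Current: {current / 1024 ** 2:.2f} MiB, peak: {peak / 1024 ** 2:.2f} MiB")
```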
%prun
Especially when you run a more realistic code example, you will have multiple function calls along the way. With %prun we can dig deeper and see exactly where we lose time. For that we create another function called slower() that calls slow():
def slower():
    time.sleep(1)
    slow()
    time.sleep(2)
We can inspect our code by adding %prun in front of our statement:
%prun slower()
When I profiled the code, it took 8 seconds to run slower(), and 5 of those seconds were spent running slow():
The table gives us a good indicator of how often a function got called and the time spent there. But as with %memit, we only have one run and the numbers may change the next time you measure.
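%prun is built on the profiler from the standard library, so you can get a comparable table outside Jupyter with cProfile and pstats. A minimal sketch (with the sleeps shortened so the profile finishes quickly):

```python
import cProfile
import io
import pstats
import time

def slow():
    time.sleep(0.5)

def slower():
    time.sleep(0.1)
    slow()
    time.sleep(0.2)

# Profile only the call we are interested in
profiler = cProfile.Profile()
profiler.enable()
slower()
profiler.disable()

# Render the same kind of table %prun shows, sorted by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
report = stream.getvalue()
print(report)
```

Sorting by cumulative time puts the functions at the top in which your program (including their callees) spent the most time.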
Next
With these 4 magic commands for Jupyter we can get a glimpse into the performance metrics of our code. While this is in no way a dependable benchmark, it gives us enough detail to see if we should optimise our statements.
Next week we explore the various ways to create a DataFrame in Pandas.