Python Friday #155: Download Jetpack Statistics With Playwright

With our newly gained knowledge about Playwright, we have everything we need to automate a browser for repetitive tasks. Let’s figure out how Playwright differs from Selenium when it comes to downloading Jetpack statistics.

This post is part of my journey to learn Python. You can find the other parts of this series here. You can find the code for this post in my PythonFriday repository on GitHub.

 

Back to Python Friday post #146

This post revisits the topic I covered in Python Friday #146: Download Jetpack Statistics with Selenium. However, this time we use Playwright instead of Selenium.

Check the older post for the prerequisites and how to create the .env file with your credentials. If you followed along with the Selenium solution, you can use the same .env file.

 

Start with the codegen template

For this post we use the starting template of the test recorder codegen. If you get stuck anywhere, start codegen, click through the application and then copy the generated code into your application.

I only changed the parameter of the run() method to accept a start_date and set that date in the sync_playwright() block:
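The original listing is not reproduced here, so the following is only a minimal sketch of what such a modified codegen template could look like. The start date and the main() wrapper are my assumptions for illustration; the calls on the playwright object are the ones codegen typically generates.

```python
from datetime import date


def run(playwright, start_date: date) -> None:
    """Codegen-style skeleton that now accepts a start_date."""
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()

    # The login and the download of the statistics from start_date
    # onwards would go here and work with the page object.

    context.close()
    browser.close()


def main() -> None:
    # Imported here so that run() can be exercised without starting a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        # Hypothetical start date, pick the first day you want to download.
        run(playwright, date(2022, 12, 1))
```

To start the script, call main() from the usual `if __name__ == "__main__":` guard.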

 

Log in to WordPress.com

As with Selenium, we need to log in to WordPress.com with Playwright. The login site uses a two-step form where the password field only appears after we have entered the username. This transition can take time, which is why we add a few time.sleep() calls between the different parts:

The site selection at the end is something I have needed to do for the last few weeks. If you only have one blog, you can remove this part.
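As a rough sketch of the two-step login described above: the label texts, the button names, the login URL and the way the site is selected are assumptions on my side and may not match what WordPress.com currently shows, so let codegen confirm them. The credentials are passed in as parameters; where you load them from (for example the .env file) is up to you.

```python
import time

# Assumed login URL, verify it with codegen.
LOGIN_URL = "https://wordpress.com/log-in"


def login(page, username, password, site_name=None, pause=2.0):
    """Two-step login: username first, then the password field appears."""
    page.goto(LOGIN_URL)

    # Step 1: enter the username and continue (label text is an assumption).
    page.get_by_label("Email Address or Username").fill(username)
    page.get_by_role("button", name="Continue").click()
    time.sleep(pause)  # give the form time to show the password field

    # Step 2: enter the password and submit (button name is an assumption).
    page.get_by_label("Password").fill(password)
    page.get_by_role("button", name="Log In").click()
    time.sleep(pause)  # wait for the redirect after the login

    # Optional site selection, only needed with more than one blog.
    if site_name:
        page.get_by_role("link", name=site_name).click()
```

The pause parameter keeps the waiting time in one place, so you can increase it on a slow connection without touching the rest of the function.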

 

Iterate through the days

If you want detailed statistics, it is best to use the daily statistics and iterate through the days. Be aware that Jetpack returns the statistics of today if we ask for a date in the future. To prevent us from creating rubbish data, we need to check that we stay within a valid date range:

Today’s statistic can change until 23:59:59; therefore, we exclude the current day.
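One way to express this date range check is a small generator that stops before the current day. This is only a sketch; the function name days_to_download and the optional today parameter (handy for testing) are my own choices and not part of the original script.

```python
from datetime import date, timedelta
from typing import Iterator, Optional


def days_to_download(start_date: date, today: Optional[date] = None) -> Iterator[date]:
    """Yield every day from start_date up to and including yesterday."""
    if today is None:
        today = date.today()

    current = start_date
    # Stop before today: its numbers can still change, and dates in the
    # future would only return today's statistics again.
    while current < today:
        yield current
        current += timedelta(days=1)
```

A start date in the future simply yields nothing, so the download loop does not run at all.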

 

Download the statistics

We can find the URL for our blog statistics by clicking through WordPress.com. At the bottom of the page is a link with the text “Download data as CSV” that turns the displayed table into a CSV file:

At the bottom of the page is the download link we are looking for.

In our script we then tell our browser to access that statistics page for a specific day, let the page load, scroll to the end of the page, and then click on the download link:
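The steps for a single day could look roughly like the sketch below. The URL pattern, the file name and the download_day() helper are assumptions for illustration; copy the real URL of your blog statistics from the browser. Only the link text “Download data as CSV” comes from the page itself.

```python
import time
from datetime import date
from pathlib import Path

# Assumed URL pattern for the daily statistics, replace it with your own.
STATS_URL = "https://wordpress.com/stats/day/posts/{site}?startDate={day}"


def download_day(page, site: str, day: date, target_dir: str, pause: float = 2.0) -> Path:
    """Open the statistics for one day and save the CSV file."""
    page.goto(STATS_URL.format(site=site, day=day.isoformat()))
    time.sleep(pause)  # let the page load

    # The download link sits at the bottom of the page.
    link = page.get_by_text("Download data as CSV")
    link.scroll_into_view_if_needed()

    # Playwright hands us the download as soon as the click triggers it.
    with page.expect_download() as download_info:
        link.click()

    # Hypothetical naming scheme: one file per day.
    target = Path(target_dir) / f"{day.isoformat()}.csv"
    download_info.value.save_as(target)
    return target
```

Saving the file under the date it belongs to means you do not have to rely on the name the browser suggests.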

 

Comparison of Playwright and Selenium

Solving the same problem with two different frameworks gives us a good idea of how they differ and what stays the same. The solution for Playwright needs around 20 fewer lines of code, mostly because it needs fewer imports.

I used labels to select the elements in Playwright, while I preferred the ID selectors in Selenium. You can use either way to select elements in both frameworks.
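To illustrate the two selection styles on the Playwright side, here is a tiny sketch. The label text and the element ID are assumptions about the WordPress.com login form, and username_field() is a hypothetical helper, not code from either solution.

```python
def username_field(page, by_id: bool = False):
    """Return a locator for the username field, by label or by ID."""
    if by_id:
        # CSS ID selector, similar to what By.ID does in Selenium.
        return page.locator("#usernameOrEmail")
    # Label-based selector, the style codegen tends to suggest.
    return page.get_by_label("Email Address or Username")
```

Both variants return a locator, so the code that follows (for example a call to fill()) stays the same.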

For me, codegen made the biggest difference between the two frameworks. When I got stuck writing my test, I could use codegen and record my interactions. Copying the result to the application saved me a lot of time. Therefore, I think Playwright has an advantage over Selenium when it comes to automating tasks in a browser.

 

Next

Next week we look at how we can reuse Selenium Grid with Playwright and what we need to do to connect to BrowserStack.
