Python Friday #122: Asynchronous Jobs With Celery

RabbitMQ is a great tool for asynchronous jobs. It works well, but we need to build all the infrastructure parts on our own. Celery can run on top of RabbitMQ and offers us a lot more comfort for our asynchronous jobs. However, that comfort comes at the price of additional complexity that we need to handle. In this post we start with the basic set-up, and over the next weeks we take a deeper look at the features of Celery.

This post is part of my journey to learn Python. You can find the other parts of this series here. You can find the code for this post in my PythonFriday repository on GitHub.

 

Install Celery

Celery is a flexible distributed system for processing large volumes of messages. Its focus is on real-time processing, but it also supports task scheduling. In addition, it provides the necessary tools for the operation and maintenance of such a system.

We can install Celery with this command:
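pip install celery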

This command is enough for the parts I need. If you need other tools to work with Celery, take a closer look at the bundles the project offers. That way you need to type less and you get matching packages.
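For example, a command like the following installs Celery together with the dependencies for the Redis transport (shown here only to illustrate the bundle syntax, not something this post needs):

pip install "celery[redis]"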

I use RabbitMQ as the message broker. You can find the installation instructions in my post on RabbitMQ.

 

Terminology

Celery uses a different terminology than RabbitMQ. This snippet from the GitHub page, which explains that Celery is a distributed task queue, gives a quick overview of the most important terms:

Task queues are used as a mechanism to distribute work across threads or machines.

A task queue’s input is a unit of work, called a task, dedicated worker processes then constantly monitor the queue for new work to perform.

Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task a client puts a message on the queue, the broker then delivers the message to a worker.

A Celery system can consist of multiple workers and brokers, giving way to high availability and horizontal scaling.

 

Define a task

As with RabbitMQ, we need two parts: one that creates a job (the client) and one that processes the job (the worker). In Celery we can create a task definition that we can reuse in both the worker and the client. To see a noticeable difference in how we run our code, our prepare() method in the file celery_task.py sleeps for 5 seconds and then writes a message to the console.
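A minimal sketch of celery_task.py that matches the worker output shown later could look like this (the broker URL and the order_id parameter are my assumptions, not necessarily the exact code of the original post):

import time

from celery import Celery

# Point Celery at the local RabbitMQ broker (default guest account).
app = Celery('celery_task', broker='amqp://guest:guest@localhost:5672//')


@app.task
def prepare(order_id):
    # Simulate a long-running job, then report on the console.
    time.sleep(5)
    print(f"order #{order_id} prepared")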

 

Queue a job

While our client can be any Python application, it can also run in the REPL. There we import the prepare() method and call it directly.
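With the celery_task module from above, that looks roughly like this:

>>> from celery_task import prepare
>>> prepare(2)
order #2 prepared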

If you run this on your machine, you will notice that it takes the 5 seconds of our sleep() call until you can enter the next command. This is because we execute the method directly in the REPL and Celery is not involved. That makes testing a lot simpler later on, because we can run Celery tasks without invoking Celery.

To run our method in Celery, we need to append .delay to the method name. This time we get back an object of the type AsyncResult without any waiting time.
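In the same REPL session, the call and its result look roughly like this (the id differs on every run):

>>> prepare.delay(2)
<AsyncResult: 700aae6c-3fcf-44cd-8c2c-cd89d82ada77>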

Our task now waits in the RabbitMQ queue, which we can inspect on the Docker container.
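Assuming the broker runs in a Docker container named rabbitmq, a command like this lists the queues and their pending messages:

docker exec rabbitmq rabbitmqctl list_queues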

Timeout: 60.0 seconds …
Listing queues for vhost / …
name messages
celery 1

 

Start the worker

We can start a Celery worker by running the celery command in another console.
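A typical invocation looks like this (the --loglevel flag is my addition to get the informational output shown below):

celery -A celery_task worker --loglevel=INFO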

This command tells Celery to run the worker process with the task definition in the file celery_task. After the worker is initialised, you should get an output similar to this one:

[…] Connected to amqp://guest:**@127.0.0.1:5672//
[…] mingle: searching for neighbors
[…] mingle: all alone
[…] celery@XYZ ready.
[…] Task celery_task.prepare[700aae6c-3fcf-44cd-8c2c-cd89d82ada77] received
[…] order #2 prepared
[…] Task celery_task.prepare[700aae6c-3fcf-44cd-8c2c-cd89d82ada77] succeeded in 5.01600000000326s: None

We can see that Celery spent a bit over 5 seconds to process our task (mainly due to the sleep(5) instruction). With this basic example we know that everything works together, and we can now explore the more interesting features of Celery.

 

Next

If you follow along on Linux, you should get the same results as described here. On Windows it is a different story: you get the AsyncResult object back, but in the Celery console you will not see any output of our task. Next week we take a look at how to solve this problem without changing your development environment to Linux.
