Celery Tutorial
When you work on data-intensive applications, long-running tasks can seriously slow
down your users.
Modern users expect pages to load instantaneously, but data-heavy tasks may take many
seconds or even minutes to complete. How can we make sure users have a fast
experience while still completing complicated tasks?
Enter Workers + Message Queues
If we want users to experience fast load times in our application, we’ll need to offload
some of the work from our web server.
Workers
One way we do this is with asynchronicity. While the webserver loads the next page, a
second server is doing the computations that we need in the background.
We call these background, task-based servers “workers.” While you typically only have
one or a handful of web servers responding to user requests, you can have many worker
servers that process tasks in the background.
These workers can then make changes in the database, update the UI via webhooks or
callbacks, add items to the cache, process files, send emails, queue future tasks, and
more! All while our main web server remains free to respond to user requests.
Message Queues
We tell these workers what to do via a message queue. Put simply, a queue is a first-in,
first-out data structure. When we store messages in a queue, the first one we place in the
queue will be the first to be processed; tasks are started in the order we add them.
When a worker becomes available, it takes the first task from the front of the queue and
begins processing. If we have many workers, each one takes a task in order.
The queue ensures that each worker only gets one task at a time and that each task is
only being processed by one worker.
It takes care of the hard part of receiving tasks and assigning them appropriately to
workers.
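To make the queue idea concrete, here is a minimal Python sketch of first-in, first-out
behavior using collections.deque. This is only an illustration of the data structure;
Celery and Redis handle all of this for us in practice:

from collections import deque

# a queue: first in, first out
queue = deque()

# the web server adds tasks as they come in
queue.append('resize image')
queue.append('send email')
queue.append('generate report')

# each available worker takes the oldest task from the front
while queue:
    task = queue.popleft()
    print('processing:', task)

# prints the tasks in the exact order they were added:
# resize image, send email, generate report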
Since we want Celery to have access to our database, models, and logic, we’ll define the
worker tasks inside of our Django application.
However, these tasks will not run on our main Django webserver. Instead, Celery will
manage separate servers that can run the tasks simultaneously in the background.
Since we need that queue to be accessible to both the Django webserver (to add new
tasks) and the worker servers (to pick up queued tasks), we’ll use an extra server that
works as a message broker.
That message broker server will use Redis — an in-memory data store — to maintain the
queue of tasks.
We can continue to add workers as the number of tasks increases, and each worker will
remove tasks from the queue in order — allowing us to process many tasks
simultaneously.
So, how does it actually work in practice? Here are the steps:
Requirements
Install Redis & start it up locally at port 6379
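If you don’t have Redis yet, one easy way to get it running locally (assuming macOS with
Homebrew; Docker works just as well) is:

brew install redis
redis-server    # listens on port 6379 by default

# or, using Docker:
docker run -p 6379:6379 redis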
1. Set Up Django
Let’s create a new Django project to test out Celery:

pip install django celery redis
django-admin startproject celery_tutorial
cd celery_tutorial/
python manage.py migrate
python manage.py runserver

Visit http://localhost:8000/ to confirm the project is running, then stop the
development server and move into the inner settings directory:

cd celery_tutorial/
We need to set up Celery with some config options. Create a new file called celery.py:

import os

from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'celery_tutorial.settings')

app = Celery('celery_tutorial')

# read any CELERY_-prefixed settings from Django's settings.py
app.config_from_object('django.conf:settings', namespace='CELERY')

# automatically discover a tasks.py in each installed Django app
app.autodiscover_tasks()
This file creates a Celery app using the Django settings from our project. The last line
tells Celery to try to automatically discover a file called tasks.py in all of our Django
apps.
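As a hypothetical example, an app called emails could define its background work in
emails/tasks.py like this (the app and function names are made up for illustration):

from celery import shared_task

@shared_task
def send_welcome_email(user_id):
    # runs on a worker, not on the web server
    print('sending welcome email to user {0}'.format(user_id))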
To make sure this Celery app is loaded every time Django starts, we also import it in
celery_tutorial/__init__.py:

from .celery import app as celery_app

__all__ = ('celery_app',)
Finally, we need to tell Celery how to find Redis. So, open settings.py and add this line:
CELERY_BROKER_URL = 'redis://localhost:6379'
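If you want to confirm that Redis is actually reachable at that address, redis-cli gives
a quick sanity check:

redis-cli -p 6379 ping
# should respond with: PONG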
For simplicity, though, we’re going to create our first task in celery_tutorial/celery.py,
so re-open that file and add this to the bottom:
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
This simple task just prints all the metadata about the request when the task is received.
It’s not a super useful task, but it will show us that Celery is working properly and
receiving requests.
At this point, we have:
- Created a Celery app instance that manages all tasks in our application
- Started Redis and given Celery its address as our message broker
- Created our first task, so the worker knows what to do when it receives a task request
Now, the only thing left to do is queue up a task and start the worker to process it.
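One way to queue it is from the Django shell. This is a sketch; the AsyncResult id will
be different on your machine:

python manage.py shell

>>> from celery_tutorial.celery import debug_task
>>> debug_task.delay()
<AsyncResult: fe261700-2160-4d6d-9d77-ea064a8a3727>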
We got back a successful AsyncResult — that task is now waiting in Redis for a worker to
pick it up!
Remember the task was just to print the request information, so this worker won’t take
long.
All we have to do is run Celery from the command line, pointing it at our app module.
Make sure you’re in the base directory (the one with manage.py) and run:

celery -A celery_tutorial worker --loglevel=info
You should see Celery start up, receive the task, print the answer, and update the task
status to “SUCCESS”:
 -------------- celery@<your-hostname> v4.4.2 (cliffs)
--- ***** -----
-- ******* ---- macOS-10.15.3-x86_64-i386-64bit 2020-04-18 20:41:52
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         celery_tutorial:0x1107b5a90
- ** ---------- .> transport:   redis://localhost:6379//
- ** ---------- .> results:     disabled://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . celery_tutorial.celery.debug_task

[INFO/ForkPoolWorker-8] Task celery_tutorial.celery.debug_task[fe261700-2160-4d6d-9d77-ea064a8a3727] succeeded in 0.0015866540000000207s: None
Basically, no matter what cloud infrastructure you’re using, you’ll need at least 3 servers:
1. Django web server
2. Redis server (the message broker)
3. Celery worker server
The cool thing about Celery is its scalability. So you can add many Celery servers, and
they’ll discover one another and coordinate, using Redis as the communication channel.
This allows for a very high throughput of tasks. As you add more tasks to the queue (e.g.
from more users), you can add more worker servers to scale with demand.
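For instance, you could start several uniquely named workers against the same Redis
broker; the node names below are illustrative:

celery -A celery_tutorial worker --concurrency=8 -n worker1@%h
celery -A celery_tutorial worker --concurrency=8 -n worker2@%h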
Individual worker tasks can also trigger new tasks or send signals about their status to
other parts of the application. This means that decoupled, microservice-based
applications can use Celery to coordinate and trigger tasks across services.
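As a sketch of that idea, Celery’s chain primitive queues a pipeline of tasks, passing
each task’s return value to the next. The tasks below are hypothetical:

from celery import chain, shared_task

@shared_task
def fetch_data(url):
    # hypothetical: download something and return it
    return 'data from {0}'.format(url)

@shared_task
def clean_data(raw):
    # receives fetch_data's return value automatically
    return raw.upper()

@shared_task
def report_done(cleaned):
    print('finished: {0}'.format(cleaned))

# queue all three; each step runs after the previous one succeeds
chain(fetch_data.s('https://example.com'), clean_data.s(), report_done.s()).delay()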
As a Python developer, I don’t hear enough people talking about Celery and its
importance. I’m a huge fan of its simplicity and scalability.
About Bennett
I’m a software developer in New York City. I do web stuff in Python and JavaScript.
Like what you’ve read here? I have an email list you can subscribe to. Infrequent emails,
only valuable content, no time wasters. I’d love to have you there.