How to actually get Celery running with Django peacefully
Celery is a task queue written in Python that is commonly recommended for running asynchronous tasks with Django.
I use it in my own work for two things:
- When a user asks the web application to perform a lot of work, I hand that task off to Celery so that the application server doesn’t get clogged up, and so that the user’s page request returns in a satisfying timeframe and doesn’t time out.
- I run regularly scheduled jobs in my Django application, similar to cron. Celery is convenient for this because it is application-aware.
For many years I ran Celery as a Docker container. Done this way, it’s pretty straightforward: you call the Celery executable directly in the container initialization step, and as long as the container is still healthy you know Celery is running. Easy.
If you don’t want to use Docker, then you need to set up a daemon. This is where it gets tricky, because Celery’s documentation kind of sucks. Editorial note here: a year or so ago Celery’s official website was completely taken offline for a period of weeks by a domain lapse. So for a major piece of Python open source software, these guys were never running a tight ship, and it’s not surprising that their documentation is a little on the weak side.
The formal documentation on this is the Daemonization page in Celery’s user guide.
Before I dive in, let’s take a step back: you need to install Celery on your system.
I strongly recommend installing Celery using a virtualenv associated with your Django app. The reason is that you want Celery to be running with the same dependency stack as the code it’s going to be processing. Someone can come along and correct me and tell me that Celery can magically sort this out on its own, but the documentation doesn’t come out and scream that so I think it’s better to be safe than sorry.
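A quick way to confirm that Celery and Django really do live in the same environment is to ask the interpreter itself. The sketch below is my own illustration, not something from the Celery docs, and the helper names (in_virtualenv, is_importable, report) are made up for this example. Run it with the virtualenv’s python, not the system one.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv/virtualenv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def is_importable(package: str) -> bool:
    """True when this interpreter can locate the top-level `package`."""
    return importlib.util.find_spec(package) is not None


def report(packages=("django", "celery")) -> dict:
    """Map each package name to whether this interpreter can import it."""
    return {name: is_importable(name) for name in packages}


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtualenv: ", in_virtualenv())
    for name, found in report().items():
        print(f"{name:<8}", "ok" if found else "MISSING")
```

If celery shows up as MISSING here but works from your shell, you are most likely running a different interpreter than the one your Django app uses.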
That being said, the daemonization guide they offer makes a lot of assumptions about your system and if anything doesn’t match up then you’re in for rough seas.
They offer a couple of different options for handling your daemon. I’m going to focus on the systemd option, because that’s what I use, and it’s very easy to do other Python-y service management with it (I run gunicorn with systemd as well).
Digging in, here are the things in these docs that are either wrong or that you may want to adjust, or at least be aware of.
Configuration File
In the case of this demonstration, the configuration file is actually just a set of environment variables that you’re handing to systemd, not literal Celery configuration settings. The documentation recommends placing it in /etc/conf.d/. I’m not exactly sure why it recommends this, but I personally prefer keeping my application config files in a cozy central location and then symlinking them anywhere else they need to be (or vice versa). You can put this file anywhere you want, as long as systemd has read access to it. You set the location with EnvironmentFile in the service definition.
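For reference, here is roughly what that environment file contains. This is a sketch modelled on the example in Celery’s daemonization guide as I remember it, so check the variable names against the current docs. The paths, the app name myproject, and the render_env_file helper are placeholders I made up.

```python
# Hypothetical values: adjust every path and name to your own system.
SETTINGS = {
    "CELERYD_NODES": "w1",                           # worker node name(s)
    "CELERY_BIN": "/opt/myproject/venv/bin/celery",  # celery inside the virtualenv
    "CELERY_APP": "myproject",                       # package holding the Celery app
    "CELERYD_OPTS": "--time-limit=300 --concurrency=4",
    "CELERYD_PID_FILE": "/var/run/celery/%n.pid",
    "CELERYD_LOG_FILE": "/var/log/celery/%n%I.log",
    "CELERYD_LOG_LEVEL": "INFO",
}


def render_env_file(settings: dict) -> str:
    """Render KEY="value" lines, the form systemd's EnvironmentFile= reads."""
    return "".join(f'{key}="{value}"\n' for key, value in settings.items())


if __name__ == "__main__":
    # Prints the file to stdout; redirect it to wherever you keep your
    # config, then point EnvironmentFile= at that path.
    print(render_env_file(SETTINGS), end="")
```

Note that these are plain shell-style variables consumed by the service’s start command. None of them are settings that Celery itself reads.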
Assumption of User/Group
The docs appear to assume you already have a user and group named celery. If you don’t, go ahead and make them.
/var/run
The configuration they recommend places the PID file in /var/run/celery. However, in a standard Ubuntu installation, no such directory will exist or be created on your behalf. To get around this, add RuntimeDirectory=celery to your service definition. This will automatically create the /var/run/celery directory with the appropriate permissions. The implication here is that your RuntimeDirectory value must match the directory in your desired PID file path.
You will need a somewhat similar accommodation for /var/log, which can be achieved by simply creating the logging subdirectory and giving the celery user ownership of it.
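Before starting the service it can save some head-scratching to check these prerequisites up front. Here is a minimal pre-flight sketch using only the standard library (Unix only). The function names and the default celery and /var/log/celery values are my assumptions for illustration, and the shell commands in the hints are the usual ones rather than anything the Celery docs prescribe.

```python
import grp
import pwd
from pathlib import Path


def account_exists(user: str, group: str) -> bool:
    """True when both the user and the group are known to the system."""
    try:
        pwd.getpwnam(user)
        grp.getgrnam(group)
    except KeyError:
        return False
    return True


def owned_by(directory: str, user: str) -> bool:
    """True when `directory` exists and its owner is `user`."""
    try:
        uid = pwd.getpwnam(user).pw_uid
    except KeyError:
        return False  # unknown user cannot own anything
    path = Path(directory)
    return path.is_dir() and path.stat().st_uid == uid


if __name__ == "__main__":
    if not account_exists("celery", "celery"):
        print("missing user/group; try: sudo useradd --system --no-create-home celery")
    if not owned_by("/var/log/celery", "celery"):
        print("log dir not ready; try: sudo mkdir -p /var/log/celery "
              "&& sudo chown celery:celery /var/log/celery")
```

There is no check for /var/run/celery here on purpose: with RuntimeDirectory=celery in the unit, systemd creates that directory when the service starts.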
WorkingDirectory
The documentation’s service definition provides a WorkingDirectory value of /opt/celery but offers absolutely no explanation of why. At least in the universe I live in, Celery wants to work in or near the Python it’s going to be dealing with. Maybe they’ve got some magic that makes this make sense, but they didn’t share it with us. I set WorkingDirectory to the root folder of my Django project. In turn, this lets me set the value of CELERY_APP in the configuration file to the name of the package where the Celery app file can be discovered. (If you don’t know what I am talking about here, you have gone too far.)
My best guess as to the purpose of /opt/celery is that a lot of the other Celery tutorials use single Python scripts as examples, which might be neatly stored in /opt/celery if you were following them to a tee.
A traditional place to store your Django application might be /opt/<your application path here>/.
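To make the adjustments in this post concrete (EnvironmentFile, RuntimeDirectory, WorkingDirectory), the sketch below renders a service definition with those three values filled in. The overall layout follows the systemd example in Celery’s daemonization guide as best I recall it, so treat it as a starting point and compare it against the current docs. The render_unit helper and the /opt/myproject paths are hypothetical.

```python
def render_unit(env_file: str, working_dir: str,
                user: str = "celery", group: str = "celery",
                runtime_dir: str = "celery") -> str:
    """Render a celery.service unit as text.

    The $VARIABLES in the commands are filled in at start time from
    the file named by EnvironmentFile=, not by this function.
    """
    start = (
        "/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES "
        "--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} "
        '--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS\''
    )
    stop = (
        "/bin/sh -c '${CELERY_BIN} multi stopwait $CELERYD_NODES "
        "--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} "
        '--loglevel="${CELERYD_LOG_LEVEL}"\''
    )
    lines = [
        "[Unit]",
        "Description=Celery Service",
        "After=network.target",
        "",
        "[Service]",
        "Type=forking",
        f"User={user}",
        f"Group={group}",
        f"EnvironmentFile={env_file}",        # wherever you keep the env file
        f"WorkingDirectory={working_dir}",    # root of the Django project
        f"RuntimeDirectory={runtime_dir}",    # creates /var/run/<name> at start
        "ExecStart=" + start,
        "ExecStop=" + stop,
        "Restart=always",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_unit("/opt/myproject/conf/celery.env", "/opt/myproject"), end="")
```

The output would typically be saved as /etc/systemd/system/celery.service, followed by systemctl daemon-reload and systemctl enable --now celery.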
By taking these factors into consideration, you will have a better chance of actually getting the service running. Of everything in the stack I set up on Ubuntu 22.04, Celery was by far the most annoying thing to get running properly, but now it is, so I felt it was worth a note.