We recently moved one of our services from Rq to Celery. We're on an older version of Celery, but this advice should still apply. While Rq is a great way to start, if you're handling a large volume of messages you'll run into memory bottlenecks. That isn't a problem with Rq itself, but with using Redis as a broker. So anyone weighing Rq vs Celery should keep this in mind.
The reason we switched to Celery was the volume of messages we were handling. Since Rq relies on Redis, all your messages need to fit in memory. While Rq was great and simple to set up at first, as we grew we were constantly dealing with Rq breaking because Redis had filled its memory and stopped accepting any write operations.
We moved to Celery because it could use RabbitMQ as a broker. RabbitMQ offloads most messages to disk, which neatly took care of the memory limitation issues.
With Rq we would get stuck after about 10K messages (our messages included images, so individual messages were large). With RabbitMQ I've seen the queue grow to about 120K without so much as a single hiccup.
That's very insightful. Quick question, since you are doing this in production: how are you serializing images into a message? Base64, or something else?
Not the person you're asking, but from the experiences I've had with Celery + other messaging queues: Don't pass around large binary blobs if you can avoid it. Whether that is an image or something else is irrelevant.
You can instead pass a unique database key, a GUID, or a file path to the raw data on disk. Obviously, you will also need to engineer around that if you've got a distributed system. The tangential benefit is that you're not using a "messaging" queue or system for persisting or semi-persisting your image data. That's a big no-no: such systems are transient in nature, and that often doesn't align with how you handle binary or image data.
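To make the idea concrete, here's a minimal sketch of the pattern with a plain list standing in for the queue; `enqueue`, `worker`, and the on-disk layout are all made up for illustration:

```python
import json
import uuid

def enqueue(queue, image_path):
    # The message carries only a GUID and a path to the raw data on disk,
    # so the broker never has to hold the binary payload itself.
    message = {"id": str(uuid.uuid4()), "path": image_path}
    queue.append(json.dumps(message))

def worker(queue):
    message = json.loads(queue.pop(0))
    # Fetch the blob only when the task actually runs.
    with open(message["path"], "rb") as f:
        data = f.read()
    # ... process `data` here ...
    return message["id"]
```

The same shape works with any real broker: the message stays a few hundred bytes regardless of how big the image is.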
Base64 is for when you want to pass binary around in a text-based format, e.g. XML or JSON. But do keep in mind that because the encoding maps every 3 bytes to 4 characters, converting binary data to base64 increases the payload size by about 33%.
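You can see the overhead directly with the standard library; the 3000-byte buffer here is just a stand-in for an image:

```python
import base64

# Raw binary payload (e.g. image bytes); 3000 zero bytes as a stand-in.
raw = bytes(3000)
encoded = base64.b64encode(raw)

# Base64 maps every 3 input bytes to 4 output characters,
# so the encoded payload is ~33% larger than the original.
print(len(raw), len(encoded))  # 3000 4000
```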
You really should not cram images into your task queue. Send them off to S3 or Cloud Storage keyed by a GUID, and pass along the GUID however you want (JSON is fine).
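A rough sketch of that pattern; `enqueue_image` and the bucket name are hypothetical, and the storage client is injected so any S3-compatible client with a `put_object` method would fit:

```python
import json
import uuid

def enqueue_image(image_bytes, s3_client, bucket="my-image-bucket"):
    # Park the blob in object storage under a fresh GUID...
    key = str(uuid.uuid4())
    s3_client.put_object(Bucket=bucket, Key=key, Body=image_bytes)
    # ...and keep the task payload tiny: just enough to find the blob later.
    return json.dumps({"bucket": bucket, "key": key})
```

With boto3 you'd pass `boto3.client("s3")` as `s3_client`; the worker then fetches the object by key when the task runs.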