I once used a MySQL database as a replacement for a message queue. This was the easiest solution to implement since all the servers were already connected to the database anyway. A server would write a new row to the table, and all the servers would remember the last row they had already seen. Occasionally the table gets cleared. I'm sure there are some race conditions in the system, but its only purpose is to send Discord notifications when someone breaks a high score in a video game, so it's not really critical. It's still working that way today.
The code is in there for Postgres, MS SQL and MySQL (which all support SKIP LOCKED) though at some point I abandoned all but Postgres.
If I were to write another message queue, I wouldn't use a database; I'd build it on the file system around Linux file renames, which are atomic. What I really want is a message queue that is fast and requires zero config, and a file-based message queue is both — better than a database on each count.
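A minimal sketch of the idea (Python rather than Rust, with invented directory names; this assumes POSIX rename atomicity and that all directories live on the same filesystem):

```python
import os
import tempfile
import time
import uuid
from pathlib import Path
from typing import Optional

QUEUE = Path(tempfile.mkdtemp())  # queue root (a temp dir for this sketch)
TMP, NEW, CLAIMED = QUEUE / "tmp", QUEUE / "new", QUEUE / "claimed"
for d in (TMP, NEW, CLAIMED):
    d.mkdir()

def enqueue(payload: bytes) -> None:
    # Write into tmp/ first, then rename into new/. The rename is atomic
    # on Linux, so a consumer can never observe a half-written message.
    name = f"{time.time_ns():020d}-{uuid.uuid4().hex}"
    (TMP / name).write_bytes(payload)
    os.rename(TMP / name, NEW / name)

def dequeue() -> Optional[bytes]:
    # Claim a message by renaming it out of new/. If two consumers race,
    # only one rename succeeds; the loser moves on to the next file.
    for entry in sorted(NEW.iterdir()):
        claimed = CLAIMED / entry.name
        try:
            os.rename(entry, claimed)
        except FileNotFoundError:
            continue  # another consumer got there first
        data = claimed.read_bytes()
        claimed.unlink()  # ack: remove only after a successful read
        return data
    return None
```

The zero-padded timestamp prefix keeps lexicographic directory order equal to enqueue order, so `sorted()` gives FIFO delivery.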
I really feel like file systems aren't used for enough things. File systems come with so much useful metadata.
I've experimented with using the file system for storing configuration data where each config value is a single file. Nested structures just use directories. The name of the file is the field name. The file extension is a hint about the data type contained within (.txt is obvious, but I also liked .bool). Parsing data is trivial. I don't need any special viewing or editing tools, just a file manager and text editor. You can see when a specific config value was changed by checking the file update time. You don't have to load the whole configuration just to access one field. And you could conceivably TAR up the whole thing if you wanted to transmit it somewhere.
I use it to configure little sub-projects in my personal website. I really like it, but I shudder to think of the complaining I'd hear from other developers if I ever used it in a work project, just because it's not what they're used to seeing and would require a moment or two of thinking on their behalf to get over ingrained habits.
A company I used to work for extensively used this method. It's incredibly useful to be able to read a config or state value from any language and even bash scripts quickly.
However, and this is a big drawback, once you have too many config files and you start reading and writing from different processes, you get into bottleneck situations quickly.
I haven't used this system extensively yet. But I don't really see how that situation gets improved by having a single-file configuration system.
First of all, if you have multiple processes trying to read/write the same config, that's kind of suspect, and if file I/O is a bottleneck for your config system, that's a different suspicious situation. Why are your processes writing to the config so... often?
But regardless, I can't see how those problems get immediately better by storing that config in a single file. If anything, having it split across multiple files would improve the situation, as different processes that might only be concerned with different sections of the config won't need to wait on file locks from processes unrelated to their concerns.
I realize it may have sounded like I was suggesting the approach. Quite the contrary: I would never do that again or suggest it. I was merely pointing out that it was quite useful at times :-) We had lots of problems similar to what you were describing. Luckily that's all in the past now.
The paradigm is also used by /proc and /sys, so I guess other developers won't get confused. However I never tried to tar -x into /proc to start the same set of processes on another node, or as an alternative to /etc/sysctl.conf :)
This was tried and called Elektra, I think around Y2K. I don't believe the idea was even new then, but there was also research into tiny-file performance at the time, resulting in things like ReiserFS. I think it packed tiny files into the directory itself, resulting in blistering speed.
Anyway it’s an elegant idea. Silly to have dozens of config file formats when the fs already has everything it needs. We have xattr too.
The flaw on the OS level is that it is hard to get everyone to change. For new apps not a problem, and any performance concerns are no longer an issue for config.
Oh man, reiserfs. Seeing that name reminds me that the original developer, Hans Reiser, is currently spending time in prison for murdering his wife. She was the interpreter during his first meeting with a Russian "mail-order bride".
It's a single-byte file. You read the entire file contents and if it's not zero, it's true. The existence of the file tells you whether or not to use a default value. An INI file would have to be fully parsed before you even know whether it contains that value.
You read the entire file contents, trim leading and trailing whitespace and toLower for good measure if you want, then validate against your list of installed themes. Done. No goofy JSON or YAML parser in sight.
And what reinvention is there? If you're just using a system that already exists, you're not reinventing anything.
I actually did this once (over SMB on Windows, though) - and unintentionally crippled our corporate SAN with all of its polling and locking activity. I had a cluster of 20 workers which would poll every five seconds for messages, and I believe we had an EMC VNX storage appliance. I never did figure out why that was enough to bring the whole thing to its knees, but IT was very quick to track the problem back to me.
Interesting. What makes you want to switch to the file system? I wrote one for a project[0] a while back (for MongoDB) and it didn't seem like the database introduced too much complexity. I didn't write the implementation from scratch, but the couple hundred lines of code were easy to reason about.
I found almost all message queues to be horribly complex to configure, debug and run. Even database queues require a lot of config, relative to using the file system.
I did actually write a file system based message queue in Rust and it instantly maxed out the disk at about 30,000 messages a second. It did about 7 million messages a second when run as a purely RAM message queue but that didn’t use file system at all.
It depends what you’re doing of course… running a bank on a file system queue wouldn’t make sense.
A fast message queue should be a tiny executable that you run and you’re in business in seconds, no faffing around with even a minute of config.
> I did actually write a file system based message queue in Rust and it instantly maxed out the disk at about 30,000 messages a second. It did about 7 million messages a second when run as a purely RAM message queue but that didn’t use file system at all.
Did you try an in-memory filesystem through tmpfs?
Database config should be two connection strings: one for the admin user that creates the tables and another for the queue user. Everything else should be stored in the database itself. Each queue should be in its own set of tables. Large blobs may or may not be stored as references to external files.
Shouldn't a message send be, worst case, a CAS? It really seems like all the work that's gone into garbage collection would have some use for in-memory high-speed queues.
Are you familiar with the LMAX Disruptor? It is a Java-based cross-thread messaging library used for day-trading applications.
Since you seem to be from citusdata: I used cstore_fdw 2 - 3 years back and at least when paired with TPC-H it was horrendously broken for both small (10 gig) and large (100 gig) datasets. It has been integrated into some other product by the time being, I hope you managed to improve it.
This is actually pretty common, and usually a "good enough" solution. You can also add things like scheduling (add a run_at column), at-least-once execution (mark a row when it is being processed; delete it only when successful), topics, etc., with minor modifications to your table.
If you want something that works "well enough" I'd say it's a reasonable choice.
Yeah, I'm using it as a transactional outbox to ensure at least once delivery to SNS.
Can't really think of a better way to ensure that a message is always sent if the DB transaction succeeds and is never sent if it fails.
You can get very far by ensuring at-least-once delivery and making everything idempotent (that gets you as close to "exactly once" as you can). With a database, the most common pattern is: insert a row for the job; when a worker starts working on it, mark it as in progress so it doesn't get started again; if the task fails, or after some reasonable timeout period, another worker can pick the task up again; and the row is only ever deleted when a worker successfully completes it.
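That pattern is small enough to sketch. SQLite is used here purely so the example is self-contained (it needs SQLite 3.35+ for `RETURNING`; in production this would be Postgres or similar), and the schema is invented:

```python
import sqlite3
import time
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    payload TEXT NOT NULL,
    claimed_at REAL            -- NULL means not yet started
)""")

TIMEOUT = 300.0  # seconds before a stalled job may be retried

def enqueue(payload: str) -> None:
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))

def claim() -> Optional[tuple]:
    # Atomically mark one unclaimed (or timed-out) job as in progress.
    now = time.time()
    return conn.execute(
        """UPDATE jobs SET claimed_at = ?
           WHERE id = (SELECT id FROM jobs
                       WHERE claimed_at IS NULL OR claimed_at < ?
                       ORDER BY id LIMIT 1)
           RETURNING id, payload""",
        (now, now - TIMEOUT),
    ).fetchone()

def complete(job_id: int) -> None:
    # Delete the row only once the work actually succeeded.
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
```

A crashed worker simply never calls `complete()`, so after `TIMEOUT` the job becomes claimable again — at-least-once delivery, hence the need for idempotent handlers.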
> We decided to store Centrifuge data inside Amazon’s RDS instances running on MySQL. RDS gives us managed datastores, and MySQL provides us with the ability to re-order our jobs.
If you don't want to publish events from uncommitted transactions you'll have to first store them in a local table and then move them to the queue after the commit. But if all consumers have direct access to the database anyway...
I am doing the same with SQL Server. The messages table is more of a bus than a queue in our case (columns like ReplyToId, etc). Using it for RPC communication between cloud bits. Much cheaper than Azure Service Bus and friends.
I don’t know the OP’s answer, but I’d hazard a guess: because Service Broker is a completely neglected feature with very little in the way of community or UI. In theory it would be great, but MS basically never invested in it after its release, and now it’s just a random “who knows when we’ll drop support for this” SQL Server feature.
I briefly worked for a major corporation 15 years ago that did this with SQL Server to create distributed worker processes to handle all the AI-generated used car listings and photo recolorings [0] for almost all of the used car lots in the country.
[0] Why take hundreds of photos of Honda Civics in red, green, blue, and black when you already have a dozen in white?
Why even take the dozen in white when they have a model you can render in any manner? Most car commercials do not have real cars in them. Maybe the shots of a car actually in motion, but most of the static shots are 3D models placed onto backgrounds. I don't know why, but I was surprised by this when I worked in a post house that did a lot of car commercials. One of the roles for a coworker was to get flown around to locations to take the images for the background plates using photogrammetry. "Can't fly an Alexa through the back glass to zoom in on the dash now can we" was one comment.
I've built a hybrid task queue/process supervisor on top of SQL. Classical task queues like Celery didn't exactly fit our use case: a single process could run for hours or days, but in case of a node failing, it must be resurrected elsewhere as soon as possible (within seconds). I didn't have the time to re-architect everything for Kubernetes, or rewrite half the product in Erlang; so I built that weird thing. It's been super stable, running mission critical code, and making us money - for several years now.
I implemented a message queue in MySQL too, and it worked pretty well. Incoming messages would be written to the table, and the workers would poll the database each cron period and process whatever rows were in the queue. To avoid race conditions, the workers would lock the records they were working on and then delete them as soon as the work was complete. It was simple, but it worked just fine for my purposes.
This has been a thing since before databases were relational. 4G languages (Progress, etc.) were especially nice for their ability to wrap a queue table around a series of reversible transactions, if you coded things right .. meaning a lot of modules written for app infrastructure were based on an 'inbox table' methodology ..
I’ve run into all sorts of database locking issues and concurrency issues when using a database as a queue. I saw that mistake made a long time ago and I would never do it myself.
Database engines are getting features like SELECT FOR UPDATE SKIP LOCKED, so what were once serious blockers on this idea may no longer be as much of a problem.
It’s not necessary, but it is a lot less fiddly: you automatically look at only the tasks that someone else isn’t currently working on, and because the lock is held by the database connection you get automatic retries if your worker crashes and drops the connection. You could figure out all of the interactions needed to make this work yourself, but if the database already has support built in you may as well use it (and there’s a straightforward path to migrate if you need more sophistication later).
No? Unless there's some edge case with that statement I don't know about. That statement is basically tailor made for queues so you can select jobs that aren't currently being worked on by other workers.
Inasmuch as you trust your db's locking correctness it eliminates the concurrency issues. You can very naively have n workers pulling jobs from a queue not stepping on each-other.
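The claim step looks roughly like this (a sketch, not run against a live server; table and column names are invented, and `conn` stands for any DB-API connection to Postgres 9.5+):

```python
# FOR UPDATE SKIP LOCKED skips rows already locked by other workers, so
# n naive workers can run this concurrently without stepping on each other.
CLAIM_SQL = """
SELECT id, payload
FROM jobs
WHERE status = 'queued'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED
"""

def work_one(conn, process) -> bool:
    # The row lock taken by FOR UPDATE lives as long as the transaction,
    # so a crashed worker's job is released automatically when its
    # connection drops -- the retry behavior mentioned above.
    with conn:  # one transaction per job
        cur = conn.cursor()
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
        if row is None:
            return False  # queue empty (or everything is locked)
        job_id, payload = row
        process(payload)  # your handler; an exception rolls back the claim
        cur.execute("DELETE FROM jobs WHERE id = %s", (job_id,))
    return True
```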
It’s not so out of the ordinary. A few libraries in Rails create message queues in Postgres using advisory locks and listen/notify.
Hell, if it’s not an RDBMS then it’ll be Redis (at a much greater expense for a managed instance). I’ve seen that setup in the Ruby world far more often than using a dedicated message queue.