I wrote an implementation in Python. https://github.com/starqueue/starqueue The ...

moron4hire · on June 4, 2023

I really feel like file systems aren't used for enough things. File systems come with so much useful metadata.

I've experimented with using the file system for storing configuration data where each config value is a single file. Nested structures just use directories. The name of the file is the field name. The file extension is a hint about the data type contained within (.txt is obvious, but I also liked .bool). Parsing data is trivial. I don't need any special viewing or editing tools, just a file manager and text editor. You can see when a specific config value was changed by checking the file update time. You don't have to load the whole configuration just to access one field. And you could conceivably TAR up the whole thing if you wanted to transmit it somewhere.
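A minimal Python sketch of that layout, for illustration only. The names `PARSERS`, `load_config`, `last_changed` and `archive` are hypothetical, and treating "0" or an empty file as false for `.bool` is an assumption rather than something stated above:

```python
import tarfile
from pathlib import Path

# Hypothetical extension-to-parser table; only .txt and .bool are
# mentioned in the comment, the parsing rules are assumptions.
PARSERS = {
    ".txt": lambda raw: raw.strip(),
    ".bool": lambda raw: raw.strip() not in ("", "0"),
}

def load_config(root):
    """Directories become nested dicts, file stems become field names."""
    config = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            config[entry.name] = load_config(entry)
        elif entry.suffix in PARSERS:
            config[entry.stem] = PARSERS[entry.suffix](entry.read_text())
    return config

def last_changed(path):
    """Modification time of one value's file, in seconds since the epoch."""
    return Path(path).stat().st_mtime

def archive(root, dest):
    """Pack the whole config tree into a single tar file."""
    with tarfile.open(str(dest), "w") as tar:
        tar.add(str(root), arcname=Path(root).name)
```

Reading a single field never requires `load_config`; opening that one file is enough, which is the point made above.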

I use it to configure little sub-projects in my personal website. I really like it, but I shudder to think of the complaining I'd hear from other developers if I ever used it in a work project, just because it's not anything they've seen before and would require a moment or two of thinking on their part to get over ingrained habits.

binwiederhier · on June 4, 2023

A company I used to work for extensively used this method. It's incredibly useful to be able to read a config or state value from any language and even bash scripts quickly.

However, and this is a big drawback, once you have too many config files and you start reading and writing from different processes, you get into bottleneck situations quickly.

moron4hire · on June 5, 2023

I haven't used this system extensively yet. But I don't really see how that situation gets improved by having a single-file configuration system.

First of all, if you have multiple processes trying to read/write the same config, that's kind of suspect, and if file I/O is a bottleneck for your config system, that's a different suspicious situation. Why are your processes writing to the config so... often?

But regardless, I can't see how those problems get immediately better by storing that config in a single file. If anything, having it split across multiple files would improve the situation, as different processes that might only be concerned with different sections of the config won't need to wait on file locks from processes unrelated to their concerns.
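To make the locking argument concrete, here is a rough sketch assuming a POSIX system and advisory `flock` locks. The helper name `locked` is hypothetical, and nothing above says which locking mechanism was actually in use:

```python
import fcntl
from contextlib import contextmanager

@contextmanager
def locked(path, blocking=True):
    """Hold an exclusive advisory lock on one existing config file.

    Assumption: POSIX flock(); with blocking=False a contended lock
    raises BlockingIOError instead of waiting.
    """
    flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
    with open(path, "r+") as handle:
        fcntl.flock(handle, flags)
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

With one file per value, a writer holding the lock on `UI/DarkMode.bool` does not delay a reader of `Options/Indent.txt`. With a single config file, both would contend for the same lock.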

binwiederhier · on June 5, 2023

I realize it may have sounded like I was suggesting the approach. Quite the contrary: I would never do that again or suggest it. I was merely pointing out that it was quite useful at times :-) We had lots of problems similar to what you were describing. Luckily that's all in the past now.

avereveard · on June 5, 2023

Sprinkle git on top of it and you also get your config versioned and distributed centrally in lock step with your software releases

Arnie97 · on June 7, 2023

The paradigm is also used by /proc and /sys, so I guess other developers won't get confused. However I never tried to tar -x into /proc to start the same set of processes on another node, or as an alternative to /etc/sysctl.conf :)

mixmastamyk · on June 5, 2023

This was tried and called Elektra I think around Y2K. Don’t believe the idea was even new then, but there was also research into tiny file performance at the time, resulting in things like reiserfs. I think it packed tiny files into the directory itself resulting in blistering speed.

Anyway it’s an elegant idea. Silly to have dozens of config file formats when the fs already has everything it needs. We have xattr too.

The flaw on the OS level is that it is hard to get everyone to change. For new apps not a problem, and any performance concerns are no longer an issue for config.

moron4hire · on June 5, 2023

Oh man, reiserfs. Seeing that name reminds me that the original developer, Hans Reiser, is currently spending time in prison for murdering his wife. She was the interpreter during his first meeting with a Russian "mail-order bride".

https://en.wikipedia.org/wiki/Hans_Reiser

wombatpm · on June 5, 2023

That was an ugly situation. PSA: Murdering your wife/exwife/girlfriend/baby mama is never the answer. Just let it go and move on.

bboygravity · on June 4, 2023

Sounds very much like how .ini files work and/or how regedit works on windows? :p

Not a bad idea at all, but perhaps re-inventing a very old wheel?

moron4hire · on June 4, 2023

It's not at all like INI files. INI files have field names and section delimiters in them. An INI file might be:

  [UI]
  DarkMode=1
  Theme=solarized
  [Options]
  Indent=4
  IndentMode=tabs

Whereas this file system config would have individual files per value.

  <user-home>/.config/<app-name>/UI/DarkMode.bool --> 1

It's a single byte file. You read the entire file contents and if it's not zero, it's true. The existence of the file tells you whether or not to use a default value. An INI file would have to be fully parsed before we know whether it contains a value for that config value.

  <user-home>/.config/<app-name>/UI/Theme.txt --> solarized

You read the entire file contents, trim leading and trailing whitespace and toLower for good measure if you want, then validate against your list of installed themes. Done. No goofy JSON or YAML parser in sight.
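Those two reads might look roughly like this in Python. The function names `read_bool` and `read_choice` are hypothetical, and falling back to the default for an empty file or an unknown theme is an assumption:

```python
from pathlib import Path

def read_bool(path, default=False):
    """A missing file means 'use the default'; otherwise non-zero is true."""
    try:
        raw = Path(path).read_bytes()
    except FileNotFoundError:
        return default
    # Assumption: the character "0", a NUL byte, or an empty file is false.
    return raw.strip() not in (b"", b"0", b"\x00")

def read_choice(path, allowed, default):
    """Trim, lower-case, then validate against the allowed values."""
    try:
        value = Path(path).read_text().strip().lower()
    except FileNotFoundError:
        return default
    # Assumption: an unrecognised value falls back to the default.
    return value if value in allowed else default
```

For example, `read_choice(".../UI/Theme.txt", installed_themes, "default")` would cover the theme case, where `installed_themes` is whatever set of names the application already knows about.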

And what reinvention is there? If you're just using a system that already exists, you're not reinventing anything.

x-shadowban · on June 5, 2023

Sort of an homage to the windows registry, but it's not a "secret 3rd thing" it's just another folder.

djbusby · on June 4, 2023

I've done and do this too. It's worked awesome for 20+ years.

moron4hire · on June 4, 2023

Oh man, symlinks to share common configuration.

don-code · on June 4, 2023

I actually did this once (over SMB on Windows, though) - and unintentionally crippled our corporate SAN with all of its polling and locking activity. I had a cluster of 20 workers which would poll every five seconds for messages, and I believe we had an EMC VNX storage appliance. I never did figure out why that was enough to bring the whole thing to its knees, but IT was very quick to track the problem back to me.

gchq-7703 · on June 4, 2023

Interesting. What makes you want to switch to the file system? I wrote one for a project[0] a while back (for MongoDB) and it didn't seem like the database introduced too much complexity. I didn't write the implementation from scratch, but the couple hundred lines of code were easy to reason about.

[0] https://github.com/gchq/Bailo/tree/main/lib/p-mongo-queue

andrewstuart · on June 4, 2023

The filesystem means zero configuration.

I found almost all message queues to be horribly complex to configure, debug and run. Even database queues require a lot of config, relative to using the file system.

I did actually write a file system based message queue in Rust and it instantly maxed out the disk at about 30,000 messages a second. It did about 7 million messages a second when run as a purely RAM message queue but that didn’t use file system at all.
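For readers wondering what a file system queue can look like, here is a small Python sketch. It is not the Rust implementation mentioned above; the class name `DirQueue` and the `tmp`/`ready`/`claimed` layout are assumptions, and it relies on rename being atomic within one file system:

```python
import os
import time
import uuid
from pathlib import Path

class DirQueue:
    """One file per message; rename is used to publish and to claim."""

    def __init__(self, root):
        self.root = Path(root)
        for name in ("tmp", "ready", "claimed"):
            (self.root / name).mkdir(parents=True, exist_ok=True)

    def put(self, payload: bytes) -> str:
        # Timestamp prefix gives roughly oldest-first ordering when sorted.
        name = f"{time.time_ns():020d}-{uuid.uuid4().hex}"
        tmp = self.root / "tmp" / name
        tmp.write_bytes(payload)
        # Publish with a rename so consumers never see a partial message.
        os.replace(tmp, self.root / "ready" / name)
        return name

    def get(self):
        for name in sorted(os.listdir(self.root / "ready")):
            claimed = self.root / "claimed" / name
            try:
                os.rename(self.root / "ready" / name, claimed)
            except FileNotFoundError:
                continue  # another worker claimed this message first
            data = claimed.read_bytes()
            claimed.unlink()
            return data
        return None  # queue is empty
```

The sketch does no `fsync`, so it trades durability for simplicity. Every message costs several file system operations, which is consistent with the disk becoming the limit long before RAM does.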

It depends what you’re doing of course… running a bank on a file system queue wouldn’t make sense.

A fast message queue should be a tiny executable that you run and you’re in business in seconds, no faffing around with even a minute of config.

I just hate configuration in general.

orangepurple · on June 4, 2023

> running a bank on a file system queue wouldn’t make sense

Bank transfers are done by a text file sent over FTP (file transfer protocol; circa 1971)

https://orientalbank.com/assets/Pdfs/ach/ACH_ORIGINATION_AGR...

lmz · on June 5, 2023

Some of them are more advanced and use SFTP (circa 1997).

Izkata · on June 4, 2023

> I did actually write a file system based message queue in Rust and it instantly maxed out the disk at about 30,000 messages a second. It did about 7 million messages a second when run as a purely RAM message queue but that didn’t use file system at all.

Did you try an in-memory filesystem through tmpfs?

sitkack · on June 5, 2023

Database config should be two connection strings, 1 for the admin user that creates the tables and another for the queue user. Everything else should be stored in the database itself. Each queue should be in its own set of tables. Large blobs may or may not be referenced to an external file.

Shouldn't a message send be, worst case, a CAS? It really seems like all the work around garbage collection would have some use for in-memory high speed queues.

Are you familiar with the LMAX Disruptor? It is a Java-based cross-thread messaging library used for day trading applications.

https://lmax-exchange.github.io/disruptor/