I think a lot of the objections I’ve seen stem from “Unix philosophy” style arguments. Also, historically the systemd project has been quite hostile to user requests in some cases where its changes broke existing workflows for people.
I personally think the basic “service manager” functionality works pretty well and I like systemd overall.
However, the same is not true for a bunch of the more peripheral services (e.g. resolved, networkd, timesyncd). What’s particularly annoying is that there exist mature open source projects (e.g. unbound, chrony) that provide better functionality in a lot of cases and could have been used, rather than bundling a half-assed attempt with systemd.
None of these are mandatory though. It's up to the distros whether to use them. For example at this point resolved is pretty commonly enabled by default, networkd not at all, and timesyncd is perhaps 50-50 with chrony.
Yes, but not using those seems to defeat the point of using systemd.
The most convincing advocacy I have seen for systemd (by a BSD developer; it's fairly well known) is that it provides an extra layer of the OS (on top of the kernel) to standardise things.
If you don't use them, you still have a standardized way of managing system services, including scheduled batch jobs. The other services are a convenient and integrated way of getting a basic version of that functionality, but they are by no means the entire point of systemd.
Apart from the comparison to NTDLL making no sense, you'd be wrong about the tools too. It aims to provide a basic standardized way to do most system-level management (set up networking, do DNS resolution, get a decently synchronized system clock) but it absolutely does not aim to replace more specialized tools like chrony or NetworkManager. For example, timesyncd doesn't do NTP or PTP, only SNTP.
systemd itself (the PID 1), journald and udev are different, though, because they're mandatory; and logind, while technically optional, has no real alternative.
How so? The functionality of the kernel and of systemd are basically orthogonal: neither really provides features that the other does. There are some interfaces in the kernel which are basically intended to be delegated to one userspace process, like cgroups and the hotplug management done by udev, so talking to them directly as an application or library is probably not a good idea, and those processes provide a means of co-ordinating that. But that's a kernel decision, and the udev one predates systemd.
Your remark may suggest that everything MS does is “wrong”. This is of course an extreme overstatement and all of their approaches should be evaluated on their own.
It's a declarative boot system, where units can declare their dependence on other units/services, and they will be brought up in the right order, with a potential for parallelization, proper error handling and proper logging, all while the user-facing complexity is a trivial .ini-like file.
It also has good integration with Linux features like cgroups, allowing easy user isolation, has user services that allow proper resource management, and much more.
systemd is just much better for managing cross-cutting concerns. E.g. having the machine wake up from sleep for a timer and do a task (e.g. backup or updates) which delay sleep is trivial, portable and reliable with systemd and probably technically possible without it.
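For a concrete (hypothetical) illustration of that wake-for-a-timer case, a timer unit that wakes the machine to run a matching backup.service is roughly this much configuration; the unit names are mine and it would pair with a backup.service defined separately:

```ini
# backup.timer -- pairs with a backup.service you'd define yourself
[Unit]
Description=Nightly backup, waking the machine if suspended

[Timer]
OnCalendar=*-*-* 03:00:00
WakeSystem=true
Persistent=true

[Install]
WantedBy=timers.target
```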
Sorry, I'm not deep enough into it to determine why systemd would be better. Perhaps it's not. I know it's less unix-y, but we shouldn't treat that as holy imho; it's a good guideline though, and I do try to follow the unix philosophy myself.
I personally think that most of the OS decisions made by Microsoft are wrong and should be killed with fire, but I'm also far from thinking badly of the people who made these decisions. Perhaps these decisions made sense from their standpoint given the information they had at the time.
On the other hand on Linux I'd really wish for a centralized settings repository like Registry. But not from the systemd crowd, of course.
resolved and timesyncd are intended to be removed by the end user of the operating system. Sort of like a transparent peel is intended to be removed by the buyer of a new shiny device.
I haven't seen anything worse than those two from systemd crowd.
Resolved is useful if you frequently change networks (e.g. a laptop), particularly if you’re also using an overlay such as Tailscale. Unfortunately, like anything related to systemd it is flakey and buggy and the solution to most things is to restart it.
I started using Linux because I liked stability. Systemd makes a Linux system as dynamic as a Windows one (which is nice) at the price of making it as stable as a Windows one (which is not).
I believe that part of the problem is that it is written in C, which is an absolutely terrible language for reasoning in. Writing it in Lisp would have been smart. Writing it in Go nowadays would probably be okay. But C is just an unmitigated disaster.
> Resolved is useful if you frequently change networks (e.g. a laptop)
I can see the case. I use it myself on a laptop. And by "use" I mean that I just gave up looking into Linux desktop networking like a decade ago. It works, fine.
> I believe that part of the problem is that it is written in C, which is an absolutely terrible language for reasoning in. Writing it in Lisp would have been smart. Writing it in Go nowadays would probably be okay. But C is just an unmitigated disaster.
... this is ... incredibly wrong. Hardest disagree on all points here.
iTerm2 never enabled any AI features by default (they always required an OpenAI API key, which the user had to provide). The backlash was for including an AI-related feature in the default build at all.
Following the backlash, I think they made it an optional plugin.
Relying on wall-clock time to synchronize between computers is one of the classic distributed systems problems. It is explicitly recommended against. The amount of error in the regular time stack means that you can’t really rely on time being accurate, regardless of leap seconds.
Computer clock speeds are not really that consistent, so “dead reckoning” style approaches don’t work.
NTP can only really sync to ~millisecond precision at best. I’m not aware of the state-of-the-art, but NTP errors and smearing errors in the worst case are probably quite similar. If you need more precise synchronisation, you need to implement it differently.
If you want 2 different computers to have the same time, you either have to solve it a layer higher up by introducing an ordering on events (or equivalent) or use something like atomic clocks.
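To make "introducing an ordering on events" concrete, the smallest version of that idea is a Lamport clock. A toy sketch in C (the names are mine, purely illustrative):

```c
#include <stdint.h>

/* Toy Lamport clock: a per-node counter that gives a partial ordering of
 * events without ever consulting the wall clock. */
typedef struct { uint64_t counter; } lamport_clock;

/* Local event or message send: tick and use the result as the timestamp. */
static uint64_t lamport_tick(lamport_clock *c) {
    return ++c->counter;
}

/* Message receive: take the max of local and remote, then tick, so the
 * receive is always ordered after the send. */
static uint64_t lamport_receive(lamport_clock *c, uint64_t remote_ts) {
    if (remote_ts > c->counter)
        c->counter = remote_ts;
    return ++c->counter;
}
```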
Fair, it's often one of those hidden, implicit design assumptions.
Google explicitly built Spanner around the idea that you can get distributed consistency and availability iff you control All The Clocks.
Smearing is fine, as long as its interaction with other systems is thought about (and tested!). Nobody wants a surprise (yet actually inevitable) outage at midnight on New Year's Day.
You can delegate the choice to the terminal by using ANSI color codes [1]. Then the onus is on the user/terminal developer to make sure the colors they’ve configured (or the defaults provided) are reasonable.
A downside of this is that it is quite restrictive; there are only like 8 colors (16 with the bright variants).
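To make the "delegate to the terminal" idea concrete, here is a small C sketch that emits only the symbolic SGR foreground codes (30-37) plus a reset, never picking RGB values itself:

```c
#include <stdio.h>

int main(void) {
    /* The 8 basic ANSI foreground colors, SGR codes 30-37. What they
     * actually look like is decided by the user's terminal palette. */
    const char *names[8] = {
        "black", "red", "green", "yellow", "blue", "magenta", "cyan", "white"
    };
    for (int i = 0; i < 8; i++)
        printf("\x1b[%dm%s\x1b[0m\n", 30 + i, names[i]); /* \x1b[0m resets */
    return 0;
}
```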
EDIT: I missed the “regardless of the color map” bit, that is a bit unreasonable. Either you trust the terminal emulator or don’t. I think trying to have it both ways is too much.
For example, I've seen a program that sets foreground color to a symbolic (not RGB value) yellow, and doesn't set background. While that combination might be legible on some terminals, it's definitely not on all of them. Don't assume that the user's terminal's color map makes all combinations legible.
What combinations can be assumed to be legible? I think if a terminal user has their colors configured so that some of the 8 ANSI colors aren’t legible on their background, that’s on them, with the exception of white and black.
It seems like the only way to satisfy your ask is to not use color at all.
The default configurations of most terminals include illegible color combinations.
So I think it's not on every user to somehow design optimal color palette settings on their computer that work in all combinations (if they even can), but rather on the developers of software not to say "Hey, I bet every yellow would be legible on white" or "Yolo, I bet yellow is legible on every background, so I'm just going to set this foreground to yellow and not set background at all."
> The default configurations of most terminals include illegible color combinations
This hasn’t been my experience - while I’m not familiar with a wide breadth of terminal emulators, all the ones I’ve used have a default black background with the ANSI colors being very bright, making them clearly visible. I would again say that if a terminal emulator has some of the standard ANSI colors set to not be visible on their default background, that is the terminal emulator’s problem, as it is clearly undesirable.
And of course, once a terminal program starts changing the background color then it can’t make any assumptions about which of the user’s colors will be visible - which is why, as you say, the background color should not be changed without a very good reason. If the bg is set, it should be very easy to switch it to either a “dark mode” or “light mode” to make colors visible.
But some assumptions must be made in order to make any use of color, and “the 6 standard ANSI colors (red, green, yellow, magenta, cyan, blue) are visible on the user’s background” seems like it has to be the safest assumption.
I am in support of terminal programs respecting a universal configuration to disable color: https://no-color.org/
Safe signal handling has so many footguns that it seems worth re-considering the entire API.
Even OpenSSH has had issues with it [1].
It seems very difficult to build good abstractions for it in any programming language, without introducing some function colouring mechanism explicitly for this. Maybe a pure language like Haskell could do it.
Haskell's runtime is so complex that I don't think you can write signal handling functions in Haskell. The best you can do is set a sig_atomic_t boolean inside the real signal handler and arrange for the runtime to check that boolean outside the signal handler.
Yup: see https://hackage.haskell.org/package/ghc-internal-9.1001.0/do... where it is clear that setting a handler simply writes to an array inside an MVar. And when the signal handler is run, the runtime starts a green thread to run it, which means user Haskell code does not need to worry about signal handler safe functions at all, since from the OS perspective the signal handler has returned. The user handler function simply runs as a new green thread independent of other threads.
But I like the fact that you brought up this idea. Haskell can't do it but in a parallel universe if there were another language with no runtime but with monads, we can actually solve this.
I am not sure but I think rust already allows safe signal handlers? The borrow checker makes you write thread safe code even without any active threading and signals are just emergency threads with some extra limitations... right? I don't understand this too deeply so I could be wrong here.
Rust does allow for safe signal handling, but it's sort of the same way that it allows for safe and correct interrupt handlers for people writing os kernels (signals are basically interrupts, just from kernel->user instead of hardware->kernel). You're basically constrained to no_std and have to be very careful about communications with the rest of the system using lock free mechanisms.
If handling a signal was equivalent to handling concurrency then it wouldn’t be as much of a problem.
IIRC a signal can take over execution of a running thread, so it will completely invalidate any critical sections etc. You cannot access any shared resource easily, cannot allocate memory and a lot more restrictions of this form.
Yes but the signal handling code acts as if it is on a different thread. So it cannot access the critical sections or mess up any existing state on the thread anyways. Sure the other parts need to be managed manually but just that should still go a long way. ...Right?
Not quite, by default the signal handler hijacks an existing thread. It is possible to keep a dedicated thread around that solely waits for signals, but that’s a workaround and you end up needing to also mask all signals from all other threads for correctness. And then there are also synchronous signals, which can’t be handled this way (eg. segfaults)
Imagine a scenario where the original thread state is in a critical section, in the middle of allocating memory (which may need a mutex for non-thread local allocations) etc.
The code within the signal handler can’t guarantee access to any shared resource, because the previous execution of the thread may have been in the middle of the critical section. With normal concurrency, the thread that doesn’t hold the mutex can just suspend itself and wait.
However, because the thread has been hijacked by the signal handler, the original critical section cannot complete until the signal has been handled, and the signal handling cannot yield to the original code because it is not suspendable.
Signal handling is distinct from a different thread because it blocks the execution of the “preempted thread” until the signal handler completes.
As an example, if the preempted code grabs a lock for a resource, then signal handler completion cannot depend on grabbing that lock, because that lock will never be released until the preempted code runs again, and the preempted code can never run again until the signal handler completes.
A correct signal handler can never wait for a resource held by regular code. This precludes coordination or sharing via normal locks or critical sections.
The best thing you can do is set a global variable value and that’s it. Let your main event loop mind the value and proceed from there. Only do this in a single thread and block signals in all others as the first thing you do. Threads and signals do not mix otherwise.
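A minimal sketch of that pattern in C (error handling omitted): the handler only writes a volatile sig_atomic_t, and everything else happens in the main loop.

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

static void handler(int sig) {
    got_signal = sig;   /* the only thing the handler does */
}

int main(void) {
    struct sigaction sa = { 0 };
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);
    sigaction(SIGTERM, &sa, NULL);

    while (!got_signal) {
        /* normal work goes here; blocking calls must tolerate EINTR */
        sleep(1);
    }
    printf("caught signal %d, shutting down\n", (int)got_signal);
    return 0;
}
```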
You want signalfd, which may optionally be fed to epoll or any of the other multiplexing syscalls.
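A minimal Linux-only sketch of that (error handling omitted): block the signals first so they are only delivered through the fd, then read structured records from it, or hand the fd to poll/epoll alongside everything else.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/signalfd.h>
#include <unistd.h>

int main(void) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    sigaddset(&mask, SIGTERM);

    /* Block normal delivery so these signals only show up on the fd. */
    sigprocmask(SIG_BLOCK, &mask, NULL);

    int sfd = signalfd(-1, &mask, SFD_CLOEXEC);

    /* In a real program this fd would sit in your poll/epoll set; here we
     * just block on a plain read. */
    struct signalfd_siginfo si;
    if (read(sfd, &si, sizeof si) == sizeof si)
        printf("got signal %u\n", si.ssi_signo);

    close(sfd);
    return 0;
}
```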
Signalfd can mostly be implemented on any platform using a pipe (if you don't have to mix 32-bit and 64-bit processes, or if you don't need the siginfo payload, or if you read your kernel's documentation enough to figure out which "layout" of the union members is active - this is really hairy). Note however the major caveat of running out of pipe buffer.
A more-reliable alternative is to use an async-signal-safe allocator (e.g. an `mmap` wrapper) to atomically store the payloads, and only use a pipe as a flag for whether there's something to look for.
Of course, none of these mechanisms are useful for naturally synchronous signals, such as the `SIGSEGV` from dereferencing an invalid pointer, so the function-coloring approach still needs to be used.
> Another option is to use a proper OS that includes the ability to receive signals as a part of your main event loops
Every 'nix can do that. Your signal handler just writes a byte to a pipe or fifo, and that pipe/fifo becomes the event queue your main loop reads.
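A sketch of that self-pipe trick in C (error handling omitted); write() is on the async-signal-safe list, and that's about all the handler does.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

static int self_pipe[2];

static void handler(int sig) {
    int saved = errno;                     /* write() may clobber errno */
    unsigned char b = (unsigned char)sig;
    (void)write(self_pipe[1], &b, 1);      /* async-signal-safe */
    errno = saved;
}

int main(void) {
    pipe(self_pipe);
    /* Non-blocking write end: if the pipe is ever full, drop the byte
     * rather than hang inside the handler. */
    fcntl(self_pipe[1], F_SETFL, O_NONBLOCK);

    struct sigaction sa = { 0 };
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);

    /* The read end is just another fd in the main loop's event queue. */
    struct pollfd pfd = { .fd = self_pipe[0], .events = POLLIN };
    for (;;) {
        if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
            unsigned char which;
            read(self_pipe[0], &which, 1);
            break;                         /* e.g. SIGINT: clean up and exit */
        }
        /* poll() returning -1/EINTR just means a signal fired; loop again */
    }
    return 0;
}
```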
> The best thing you can do is set a global variable value and that’s it.
Seems kinda limiting.
If I've got a slow file download going on in one thread, and my program gets a Ctrl+C signal, waiting for the download to complete before I exit ain't exactly a great user experience.
Use select() or epoll() or kqueue() to see if your socket is ready for reading. That way you can monitor your global variable too. That’s the correct way to do it.
If you have multiple threads, you start one just to mind signals.
Signal handlers are extremely limited in what they can do, that’s the point. They are analogous to hardware interrupt handlers.
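One wrinkle with "monitor your global variable too": a signal that lands between checking the flag and entering select() is only noticed on the next wakeup. pselect() closes that gap by unblocking the signal atomically with the wait. A rough sketch, where sock stands in for the download socket from the example above and SIGINT is assumed not to be blocked elsewhere:

```c
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>

static volatile sig_atomic_t got_sigint = 0;
static void on_sigint(int sig) { (void)sig; got_sigint = 1; }

/* Returns 1 when `sock` is readable, 0 when interrupted by SIGINT. */
int wait_readable_or_sigint(int sock) {
    sigset_t block, prev;
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    sigprocmask(SIG_BLOCK, &block, &prev);   /* keep SIGINT blocked normally */

    struct sigaction sa = { 0 };
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);

    for (;;) {
        if (got_sigint) {
            sigprocmask(SIG_SETMASK, &prev, NULL);
            return 0;
        }
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);
        /* SIGINT is only unblocked while pselect() sleeps, so the flag
         * check above cannot race with going to sleep. */
        int n = pselect(sock + 1, &rfds, NULL, NULL, NULL, &prev);
        if (n > 0) {
            sigprocmask(SIG_SETMASK, &prev, NULL);
            return 1;
        }
        /* n < 0 with EINTR: the handler ran; loop and re-check the flag */
    }
}
```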
In fish-shell we have to forego using the niceties of the Rust standard library and make very carefully measured calls to libc POSIX functions directly, with extra care taken to make sure any memory used (e.g. for formatting errors or strings) was allocated beforehand.
Or it's nearly impossible for a pure functional language if the result of the async signal means you need to mutate some state elsewhere in the program to deal with the issue.
I think that’s slightly orthogonal. It would still be safe, because you’d design around this restriction from the start, rather than accidentally call or mutate something you were not supposed to.
The problem with safe signal handling is that you need to verify that your entire signal handler call stack is async-signal-safe. If purity is a stronger property than that, then signal handling is a safe API without any more work.
The inflexibility due to the purity might cause other issues but that’s more a language level concern. If the signal handling API is safe and inflexible, it still seems better for a lot of use cases than an unsafe by default one.
Monads can be thought of as arbitrary function colourings, hence the prior mention of Haskell potentially being a good fit. Of course monads are implementable in almost any other language, but few have as much syntactic sugar or general library support as Haskell does, except maybe OCaml.
Yeah, but how do you design a monad that does the "tell this other thread to unblock and unwind its state because an external error has triggered" part? You know, the basic function of an interrupt?
There are two separate things here:

1) Monads used to restrict the computation available in the context of a signal handler (or function coloring etc., basically a way for a compiler or static checker to determine that a block of code does not call unsafe functions)
2) The actual process of handling a signal received by the signal handler
I think the parent and I are referring to 1). 2) is also important, but it is not a signal-specific concern. Even without a signal handler, if you want to write an application which handles async input, you still have to process that input and do something useful with it (e.g. say you are writing an HTTP server and want a network endpoint for safely killing the thing).
I think the generally recommended way to represent 2) in a pure way is to model the signal as a state machine input and handle it like all other communication.
Whilst YAML is an option, if the choice is between having the unnecessary extra features of JSON5 or YAML, JSON5 seems like the clear winner.
Allowing multiple types of quotes may be an unnecessary feature, but it is clearly a lesser evil compared to the mountain of footguns that YAML brings with it (e.g. unquoted "no" and "on" silently parsing as booleans).
How does defining a YAML resource strictly in terms of well-formed JSON + octothorpe comments introduce "the mountain of footguns that YAML brings with it"?
It doesn’t, quoting strings does solve almost all issues, but it does leave potential footguns for the future.
If you don’t enforce it, in the future the “subset of YAML” property might get weaker, especially if someone else is modifying the config.
If you treat config files the same as code, then using a safe subset of YAML is the same as using a safe subset of C. It is theoretically doable, but without extensive safeguards, someone will eventually slip up.
The note that several of them are defaults for a web app kind of sets expectations for me. These are defaults for when you're writing a web app and want SQLite to behave a bit more like MySQL or Postgres. They aren't defaults for using SQLite as an in-memory cache in a complex piece of analysis.
SQLite is an amazingly widely used piece of software. It's impossible for one set of defaults to be perfect for all use cases.
I agree. While it’s probably not possible to settle on defaults that work for each and every scenario, my personal preference is that factory defaults should primarily optimise for safety (both operational safety and safety in terms of usage).
For example, OP suggests setting the `synchronous` pragma to `NORMAL`. This can be a performance gain, but it also comes at the cost of slightly decreased durability. So for that setting, I’d feel that `FULL` (the default) makes more sense as factory default for a database.
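For concreteness, this is roughly what choosing between those settings looks like through the C API; the filename is a placeholder, and whether you keep FULL or switch to NORMAL depends on how much durability you're willing to trade.

```c
#include <sqlite3.h>
#include <stdio.h>

int main(void) {
    sqlite3 *db;
    if (sqlite3_open("app.db", &db) != SQLITE_OK) {
        fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        return 1;
    }

    /* journal_mode=WAL persists in the database file; synchronous is a
     * per-connection setting, so set it each time you open a connection. */
    sqlite3_exec(db, "PRAGMA journal_mode=WAL;", NULL, NULL, NULL);

    /* NORMAL trades a little durability for speed in WAL mode (a very
     * recent commit can be lost on power failure, but the database stays
     * consistent); keep the default FULL if that trade isn't acceptable. */
    sqlite3_exec(db, "PRAGMA synchronous=NORMAL;", NULL, NULL, NULL);

    sqlite3_close(db);
    return 0;
}
```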
I agree that they are tradeoffs and I find it strange to use a term like "sensible" to describe them. In my mind, it implies that if you're not using them you're not being sensible. And in my experience, people who can't explain that these are tradeoffs often use words like that[0]. Of course English is not my first language so that may be the reason why I think that.
These may be useful settings for a certain kind of application under a certain workload. As usual, monitor your application and decide what is suitable for your situation. It is limiting to think in a binary way of something being sensible or not.
[0] Like “Best Practices”. Anything else you can think of?
That's the essence of defaults: they're just better defaults, not an excuse to stop caring about what better settings to use once you have runtime performance metrics to guide whether they need changing and which direction to change them in.
Right, but they are not the official defaults. If you are changing the official defaults, then you might as well do it in a more principled way.
If you are going to need to optimise with performance metrics anyway, then why not stick to just the official defaults (unless the official defaults are non-functional, is that the case?)
I think anyone who is reading a blog post on “better defaults” is front loading some of the optimisation, so you could let them make a principled choice straight away for marginal extra cost.
There is some principle here: these are more sensible defaults (but still only defaults) for a very different purpose when compared to the defaults that make sense for "generic SQLite use" that has to keep those set to something that works across all versions. Having different, domain-specific defaults based on an assumption of "you're setting up a new project using the current version of SQLite" makes a whole lot of sense.
A lot of these aren’t defaults because of backwards compatibility. IMO there is no reason not to use WAL mode, but it’s not the default because it came later.
As long as all the processes are on the same machine, WAL mode is a good idea.
It's not good in the case where multiple machines share the same database over a network filesystem. Like, say, if you had a shared settings database that allowed multiple VMs to be configured in one place.
Obviously the same-machine situation is the most common. But you asked for a case where WAL is not appropriate.
> If you are going to need to optimise with performance metrics anyway, then why not stick to just the official defaults (unless the official defaults are non-functional, is that the case?)
Well, why don't you do the research and tell us?
In my 10-15 years of dealing with the official defaults of many programs, my experience is that they do work, but in 90% of cases they are overly conservative.
I don’t think it’s necessary to agree completely. You could start by codifying a minimal set of things that the majority of people agree on (user data sanitisation, authentication handling etc) and then build on it over time.
The standards could also help codify more meta things, like vulnerability policies, reporting and outages. This would be helpful to form a dataset which you can use to properly codify best practices later.
The main problem is that this increases the bar for doing software development, but you can get around this by distinguishing serious software industries from others (software revenue over a certain size, industries like fintech, user data handling etc)
Well, I can see how someone would naively think that was true, because only a fixed set of events have already happened in the past, so there is a correct answer.
I think it also stems from the way it is taught in schools, where there is a lot of focus on memorising dates and events etc, rather than on the process of actually deriving them from sources of questionable trust.
Also, the majority of focus in schools (in the UK) is on much more modern history and doesn’t really focus too much on the really ancient stuff and the extra difficulties that arise from learning about it.
it’s kind of odd though to think about a kiddo learning history as the evidence allows it to be unfolded.
they’re brand new to being a human, and even then they aren’t adult humans (i guess defined as such, post facto)
seems like our brains are craving hard structural information to establish requisite coherency once fully ‘weaned off’ by our family unit. so things are taught in the traditional scholastic type of way first, and then the more scholarly approach is introduced later, revealing who is behind the curtain in oz.
> don’t see the use case for first encrypting and then giving the key to the forge
You might only want to keep the files secret from the broader internet, rather than from a (trusted) forge, which can make the unencrypted secrets available via a UI to e.g. authenticated maintainers.
These cases would benefit from being able to see the secrets transparently in the UI when logged in.
Also, forges having access to e.g. deployment secrets is not that uncommon (e.g. GitHub with deployments via GitHub actions).
Not exactly. A private repo is entirely private. You might have an open source project which has some deployment secrets etc that you want to check in to git. All the code and config is perfectly safe to expose to the internet, but you want to hide a specific file.
I think this pattern was common for things like Ansible and Terraform configs and dotfiles.
We started using git-crypt hurriedly and out of necessity/paranoia when we heard that our GitLab instance could've been compromised due to an "anyone can be admin" class of exploit that had just been disclosed. This must've been a fair few years ago.