I think this is underrated as a design flaw for how Linux tends to be used in 2024. At its most benign it's an anachronism and a potential source of complexity; at its worst it's a major source of security vulnerabilities and unintended behavior (e.g. Linux multitenancy was designed for two people in the same lab sharing a server, not for running completely untrusted workloads at huge scale, so it doesn't really implement resource fairness or protect against DoS well).
I haven't had a chance to try it out, but this is why I think Talos Linux (https://www.talos.dev/) is a step in the right direction for Linux as it is used for cloud/servers. Though personally I think multitenancy, especially regarding containerized applications/cgroups, is a bigger problem, and I don't know if they're addressing that.
Actually, I have been wondering if using a Linux system as multi-user could be a boon for security.
As a single user, each and every process has full and complete control of $HOME. Instead, I would prefer all applications were sandboxed to their own little respective areas with minimal access to data unless explicitly authorized. Without going full Qubes OS, get some amount of application separation so my photo utility does not have permission to read ~/.ssh.
Create a user account for each application (Firefox, email client, PDF reader, etc.). Run each of those applications as its dedicated account. Each application then has its own $HOME with minimal user data. Barring a root-escalation or user-separation bug, the data in your true $HOME should be isolated. Even the process and environment-variable space is segregated.
This also has a win in that it becomes possible to better segregate the threat model of less trusted applications. Doing granular per-application network permissions is a bit hairy in Linux, but it is trivial to fully deny network access to a specific user account.
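A minimal sketch of what I mean, assuming iptables and sudo; the account name "app-firefox" is just an example, and a GUI app would additionally need access to the display server socket:

    # dedicated account for one application
    sudo useradd --create-home --shell /usr/sbin/nologin app-firefox
    # run the program as that account, with its own $HOME
    sudo -u app-firefox -H firefox
    # deny all outbound network traffic for that account
    # (the owner match only works in the OUTPUT chain)
    sudo iptables -A OUTPUT -m owner --uid-owner app-firefox -j REJECT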
Not true isolation, but for the semi-trusted development environment, gets you a little something.
I was experimenting with taking that to a middle ground using containers on NixOS. IMO the distinguishing feature of Qubes is the chrome indicating the security level of a window based on its VM - I put together https://github.com/andrewbaxter/filterway to use with window manager rules to hopefully get the same result.
For each container I'd run a `filterway` process with a unique app id outside the container and mount the filterway Wayland socket inside the container; then Wayland programs in the container would just work, IIRC (maybe I needed to set an environment variable for the Wayland socket, or XDG_RUNTIME_DIR, or something).
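Roughly like this, from memory. NixOS containers are systemd-nspawn under the hood, so the sketch uses nspawn directly; the socket path and machine name are made up, and filterway's actual invocation is in its README:

    # 1. outside the container, run filterway (per its README) so it exposes a
    #    proxy Wayland socket with a fixed app id, say /run/user/1000/wayland-untrusted
    # 2. bind that socket into the container and point Wayland clients at it
    sudo systemd-nspawn -D /var/lib/machines/untrusted \
      --bind=/run/user/1000/wayland-untrusted:/run/wayland-proxy \
      --setenv=XDG_RUNTIME_DIR=/run \
      --setenv=WAYLAND_DISPLAY=wayland-proxy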
I think the Wayland compositor itself was running as its own user, so I had some setuid commands so that the system bar launch icons could start/stop the containers as the Wayland user.
IIRC Wayland was pretty flexible; just mounting sockets in various places and making sure permissions were set on the socket worked great.
Some other quick notes: app ids are optional in the Wayland spec, but as long as you don't run any such apps in privileged contexts (outside of a container), you can still visually distinguish them. Also, IIRC Sway didn't have the ability to vary window chrome based on app id; I thought I'd indicate the permission level in the task/system bar instead, but I think other compositors do have more powerful window decoration rules.
Those solutions seem more aimed at keeping the system clean vs isolating what resources a program can access.
Flatpak does indeed get me part of the way there with better isolation, but the available apps seem so scattershot that I need a fallback mechanism for when there is no official Flatpak artifact. Distrobox makes a point of indicating it is not a security boundary.
For shared local usage or "pet" (as opposed to cattle) servers I'd agree, and in fact this is close to what Linux was designed for, since I'd consider the multiplexing lab server to also be a semi-trusted environment.
I'm referring more to how Linux is used in vast pools of "cattle" servers in the cloud, locally by e.g. one main user (who doesn't need multi-user but probably still needs some notion of "admin" and per-program permissions), or in a corporate setting (where the actual identity system is managed remotely). This is probably >99% of Linux environments.
> As a single user, each and every process has full and complete control of $HOME. I would prefer all applications were sandboxed to their own little respective areas with minimal access to data unless explicitly authorized.
This is what OpenBSD's unveil does. Firefox for example only has access to ~/Downloads (and some stuff in ~/.mozilla, ~/.config, ~/.cache) in my home directory.
Now this looks promising for mere mortals. I found jart's Linux port of pledge[0], which makes it seem possible to simply wrap utilities with a preceding script. If I couple this with distrobox/podman (which should work fine?), I might be able to pretty seamlessly lock down utilities by default with minimal shenanigans.
Assuming it does what it says on the tin, and it can work with GUI apps, this would get me almost all the way there.
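For what it's worth, I imagine the wrapper looking roughly like this; the utility name is a placeholder, the binary may be named pledge.com depending on how it's installed, and the promise/unveil flags are from my reading of the docs, so verify against the tool's help output first:

    #!/bin/sh
    # hypothetical wrapper placed earlier on PATH than the real binary:
    # allow only stdio + read-only filesystem promises, and unveil only ~/Downloads
    exec pledge -p 'stdio rpath' -v "r:$HOME/Downloads" /usr/bin/some-utility "$@"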
> Instead, I would prefer all applications were sandboxed to their own little respective areas with minimal access to data unless explicitly authorized.
You’ll be interested to learn about systemd-nspawn. You can sandbox stuff with it really easily. It’s like chroot, so it’s not really resource intensive; lighter than a container.
I think a pretty useful thing you can do is boot ephemeral instances, so whatever someone does there gets undone. That’s useful if you’re doing system testing or CI, because you just set up the machine once and then your scripts can do whatever you want. A perfect example is testing install scripts.
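For example (assuming a Debian-flavored host with debootstrap installed; the machine name is arbitrary):

    # one-time: build a minimal root filesystem to use as the template
    sudo debootstrap stable /var/lib/machines/testbox
    # throwaway instance: runs against a temporary snapshot of that tree,
    # so everything is discarded when the shell exits
    sudo systemd-nspawn -D /var/lib/machines/testbox --ephemeral --private-network /bin/bash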
Though this is also kinda the point of Flatpak and Snap, which are controversial in the Linux community. Then again, a lot of people dislike systemd, though fewer than originally.
nspawn does look interesting, and potentially exactly what I want. Although the wiki page is dense enough that I am concerned I will somehow misconfigure it and end up less secure than I would be without using it.
I Flatpak wherever I can, but several of my required applications are not first-party packaged, which makes me extra squeamish about installing them.
I read a good chunk of that wiki link, but didn't really come away with an understanding of how it differs from just using Docker for sandboxing an app.
It differs by not being insane. Trivial functionality that actually works. It's what's good about systemd.
It doesn't require forwarding sockets or giving free access to root just for building images. It doesn't explode just because you touch your nftables rules. It doesn't suddenly expose a process to the Internet because of some undocumented option. You can use all the normal tools such as auditd and SELinux without having your configuration overwritten by a madman.
You’re missing the trees for the forest. At a high level they are the same, just as with LXC or Podman or others, but it’s the details that are really important. Because you’re leveraging what’s already in the system you can really shrink things down, as another user mentioned. But there’s also a convenience in just being able to use systemd when it’s already built into your system.
I suggest also reading
man systemd-nspawn
Just type it into your terminal, you don’t need to install anything
*Nix was designed to be multi-user. It's probably the only security boundary that was present from nearly the start. I think there are some rough edges on the user-per-application model, but it should all be scriptable so that the machinations are mostly hidden.
I agree that multi-user should go away for modern server workloads; however, users are used as a blast door, mainly because Linux's security model is lacking. systemd, for example, commonly runs services under separate users to make it more difficult for a compromised application to elevate privileges. Android does something similar AFAIK.
Users should never have become a security boundary for isolating applications, but they unfortunately have, and there's not really an alternative.
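To make that concrete, systemd's own sandboxing leans on exactly this pattern; a quick way to see it in action (the unit name and command are just placeholders):

    # run a transient service under a dynamically allocated, throwaway user
    # with no access to /home
    sudo systemd-run -p DynamicUser=yes -p ProtectHome=yes \
      --unit=demo-sandbox /usr/bin/python3 -m http.server 8080
    # inspect the allocated user
    systemctl status demo-sandbox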
This is why I think multitenancy is the more important problem (though both are related), because it's the key to solving shared-kernel application permissions without "users". Containers were a step in the right direction but aren't a sufficient security boundary in themselves - what is currently handled by the "container runtime"/sandbox needs to be built into the kernel IMO.
Linux's security model doesn't become better just because everybody is doing it that way, and besides that, everybody is doing it because they are copying Linux.
Nah, it's been lacking since inception, with people trying things like chroot jails and suid bits decades before Linux was a twinkle in anyone's eye, and we still regularly fail at running untrusted code.
My impression was that all the hosted k8s providers are doing multitenancy with if not full per-customer VMs, at least additional abstractions like gVisor.
Are there some that aren't? Or are you referring here more to untrusted/shared in the sense of platforms like GitHub Actions just running everyone's different loads on the same pool of kernels?
Right, I'm talking about shared-kernel multitenancy. It isn't just about reducing the OS overhead from a host plus one or more VMs (or sandboxes) down to a single host; it's also about not having to continually start and stop VMs/sandboxes, which introduces its own resource overhead as well as a latency hit every time it's done (and that hit essentially always coincides with resource pressure from increased usage, since that's why you're scaling up). Also, even VMs and sandboxes don't really protect against DoS/resource-fairness/noisy-neighbor problems that well in many cases.
Why does this matter? Incurring kernel/sandbox boot overhead on cold start or scale-up means services have to over-provision resources to account for potential future needs, which wastes a lot of compute. I also think it's incredibly wasteful for companies to manage their own K8s clusters (if K8s supported multitenancy you'd probably want only one instance per datacenter, and move whatever per-cluster settings people depend on to per-service settings. This is also much closer to "how Google does things" and why I think Kubernetes sucks to use compared to Borg), again because of all the stranded compute, and also because of the complexity of setting it up and managing it. But without shared-kernel multitenancy, multi-tenant K8s has to employ a lot of complicated workarounds in userspace. Or you can use a serverless product, i.e. pay for someone else's implementation of those workarounds, and still suffer some of the resource/latency overhead from the lack of shared-kernel multitenancy.
This is one of the problems I want to address with my company BTW, but it would take years if not decades to get there, which is why I'm starting with something else.