I feel like it's gotten too hard to judge the size of gigabytes of RAM.
I come from a game development background where traditionally the bulk of the space usage went into media content. It felt easy to justify a game of X megabytes when 90% of that was in graphics. That doesn't seem to be where the space goes for things like this though.
To put this into perspective, the calculation I make is 1920x1080x3 bytes (an uncompressed HD screen) = 6,220,800 bytes, roughly 6 megabytes. That makes a gigabyte hold 170-ish full screens of uncompressed graphics. I'm OK with something using a gig if it's throwing that much imagery around, but if not, where does all the space go?
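If anyone wants to redo that arithmetic, here it is as a tiny C program (nothing beyond the numbers above):

    #include <stdio.h>

    int main(void) {
        /* One uncompressed 1920x1080 frame at 3 bytes per pixel (RGB). */
        long frame = 1920L * 1080L * 3L;     /* 6,220,800 bytes */
        long gig   = 1024L * 1024L * 1024L;  /* 1 GiB */
        printf("bytes per frame: %ld (~%.1f MB)\n", frame, frame / 1e6);
        printf("frames per GiB:  %ld\n", gig / frame);  /* ~172 */
        return 0;
    }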
So, to start with, it's 1920x1080x4 -- even if the alpha channel isn't used, we align pixels to 4-byte words for performance (if you want to trigger panic in a crowded room full of Linux graphics developers, say "Cirrus 24-bit shadowfb" out loud and see how quickly you get thrown out).
But not only that -- for compositing, we need to store the full window pixmap contents for each window. That's necessary for things like the blur-behind effect or antialiased window corners.
But you probably also want tiny previews of these windows, so you're going to need mipmapped versions of them stored. The mip chain for a texture is just the successive powers-of-two-down summed together -- roughly an extra 33% of memory.
And we haven't even counted double- or triple-buffering! To avoid visible tearing as the GPU draws, it needs one framebuffer to scan out and a different framebuffer to draw into. So take all of the above and multiply it again.
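Putting rough numbers on the last few paragraphs (just a sketch, assuming one 1920x1080 window pixmap, 32-bit pixels, a full mip chain, and double buffering -- real allocations vary with driver and stride padding):

    #include <stdio.h>

    int main(void) {
        long pixmap = 1920L * 1080L * 4;  /* 32-bit-aligned window pixmap */
        /* Each extra mip level is 1/4 the size of the previous one, so
           the chain sums to 1/4 + 1/16 + 1/64 + ... = 1/3 of the base. */
        long with_mips = pixmap + pixmap / 3;
        long doubled = with_mips * 2;  /* one buffer to scan out, one to draw */
        printf("base pixmap:     ~%.1f MB\n", pixmap / 1e6);
        printf("with mip chain:  ~%.1f MB\n", with_mips / 1e6);
        printf("double-buffered: ~%.1f MB\n", doubled / 1e6);
        return 0;
    }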
This is not counting all the additional baggage that apps can bring. Shoutouts to SDL for only uploading a 256x256 version of window icons -- even when the game provides a 16x16 variant, SDL will internally upscale it to 256x256 before handing it to the window manager. And you probably want to display it back down at 16x16 or maybe 64x64 for your alt-tab, so that's a full mip chain on a 256x256 texture.
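The icon case is small but illustrative. A sketch of the arithmetic for a 256x256 RGBA8 icon plus its full mip chain:

    #include <stdio.h>

    int main(void) {
        long bytes = 0;
        /* Full mip chain for a 256x256 RGBA8 icon: 256, 128, ..., 1. */
        for (int size = 256; size >= 1; size /= 2)
            bytes += (long)size * size * 4;
        printf("icon + mips: %ld bytes (~%ld KB)\n", bytes, bytes / 1024);
        return 0;
    }

About a third of a megabyte for what could have been a 16x16 icon.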
Oh, and window frames! On a reparenting WM under X11, you wrap the window in a slightly larger window, and draw the window frame in the larger one. So if you have a maximized window that's using OpenGL to draw, the app has its own pair of backbuffers, then you have a window frame in its own window pixmap, which the compositor then draws to its 1920x1080 backbuffer.
You probably also want a window title on that window frame, so that means a font glyph texture, but that probably fits in a single 1024x1024 R8 texture without mips.
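So the rough tally for a single maximized OpenGL app, under the assumptions above (1920x1080, 32-bit pixels, and the frame pixmap approximated as screen-sized), looks something like:

    #include <stdio.h>

    int main(void) {
        long screen = 1920L * 1080L * 4;
        long app_backbuffers = screen * 2;    /* the app's own double buffer       */
        long frame_pixmap    = screen;        /* the reparenting WM's frame window */
        long compositor_back = screen;        /* the compositor's own backbuffer   */
        long glyph_atlas     = 1024L * 1024L; /* 1024x1024 R8 font texture         */
        long total = app_backbuffers + frame_pixmap + compositor_back + glyph_atlas;
        printf("one maximized GL window: ~%.1f MB\n", total / 1e6);
        return 0;
    }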
Anyway, these things quickly add up! I've done memory profiling like this before. There are still many, many gains left on the floor, absolutely, but I've had people plug in 3 1920x1080 monitors and then complain that the window manager was using 20MB of GPU VRAM.
You've mentioned a bunch of stuff, but even then the numbers don't add up: double and triple buffering add just another few buffers per screen. Multiple buffers for every window still doesn't add up to the memory use levels some of these desktop environments hit. 50 maximized windows with multiple buffers still comes in under a gig.
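To put numbers on that (same assumptions as above: 1920x1080, 32-bit pixels, two buffers per window):

    #include <stdio.h>

    int main(void) {
        long window = 1920L * 1080L * 4 * 2;  /* one double-buffered window */
        long total  = window * 50;
        printf("50 maximized windows: %ld bytes (~%.2f GB)\n",
               total, total / 1e9);  /* ~0.83 GB */
        return 0;
    }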
Mipmapping on dynamic content is just daft. If you can generate mipmaps as they change without slowing things down then you can generate them on demand when they are needed.
The usual memory usage I was seeing was in the 200M-300M range -- this was on mutter/gnome-shell. As others mentioned, they can't reproduce the 1GB figure. But there was still a lot of noise about that, so I figured I'd explain where some of it is coming from.
> Mipmapping on dynamic content is just daft. If you can generate mipmaps as they change without slowing things down then you can generate them on demand when they are needed.
We do generate the contents on demand, but mipmapping requires that the texture memory be available for the full chain. In theory, you can use sparse textures in OpenGL/Vulkan, but they have several drawbacks that make them unsuitable for compositor work -- changing the mip chain configuration for a single texture can often take 1-2ms, which wasn't fast enough for our performance targets. I did the investigation!
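For context on why the whole chain has to be resident: with immutable storage (e.g. OpenGL's glTexStorage2D), every mip level is allocated up front. A sketch of what that means for a 1920x1080 RGBA8 texture (level count computed the standard way; exact driver behavior will differ):

    #include <stdio.h>

    int main(void) {
        int w = 1920, h = 1080;
        /* Count mip levels the way an up-front allocator does:
           halve until the largest side reaches 1. */
        int levels = 1;
        for (int m = (w > h ? w : h); m > 1; m /= 2)
            levels++;
        long bytes = 0;
        for (int i = 0; i < levels; i++) {
            bytes += (long)w * h * 4;  /* RGBA8 */
            if (w > 1) w /= 2;
            if (h > 1) h /= 2;
        }
        printf("%d levels, ~%.1f MB resident up front\n", levels, bytes / 1e6);
        return 0;
    }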
In the context of games: a lot of space goes to high-resolution textures which users normally never see... but they can see them if they zoom in, put their nose up to a wall, etc.