If you're running a modern Linux desktop you're probably running Wayland, and screencasting on it has long been a complete pain in the neck, with per-compositor "solutions" that mostly don't work quite right. Fortunately someone who works on Gnome wrote the obs-xdg-portal plugin, which should fix this, at least for Gnome and hopefully soon for wlroots and KDE once they fully support the underlying portal API. Until then, the easiest way to get screencasting working is just to run in X11.
(Ask me about ffmpeg raw GPU buffer capture one day; running a bunch of codec code as root is always exciting.)
Does Wayland do anything better than Xorg? Every time I see it mentioned, it is about how it does not support this or that core feature of Xorg (e.g. multiple displays with custom pixel densities/scaling, screen sharing apps being broken, etc.). What is Wayland's reason for existing?
Same reason systemd exists, the previous solutions were old and clunky and some people got fed up and decided to update. Plus some good old NIH. People who were dealing just fine with previous solutions then start complaining about breaking things for the sake of breaking things and not being able to do things the way they were used to.
In the specific case of Xorg I find the situation strange because I'd gladly have made the switch 15 years ago back when messing with Xorg.conf was a common occurrence for me and it kept getting in the way (although a big portion of the blame was with the proprietary drivers, especially AMD's). Xorg was sometimes a bit of a pig too resource-wise, but that's when I was running a PC with 256MB of RAM. I remember being fairly optimistic when I first heard about Wayland and Mir, the prospect of ditching X11 was enticing.
But now? I haven't really had to wrestle with X in a long time. It just works for me. I'm definitely not looking forward to reworking my entire workflow for minor benefits although I suppose I'll have to one day. I also use X forwarding pretty extensively, but I'm probably a small minority these days.
I agree with this. Wayland has taken so long and is still lacking in a few areas, while Xorg somehow managed to reach a point where it actually "just works" first try, with some minor problem every other year. I don't need Wayland anymore.
I'm doing my classes online with Zoom. For whatever reason, under Wayland I can not choose individual windows to share--I can only share the whole desktop. I switched to Xorg and now I can share individual windows in Zoom. I honestly don't care to support proprietary software, but application writers must do extra work to support both.
Wayland by default prevents applications from accessing other applications' display, input and output, while on X it is basically a free-for-all. Any application can see what any other application is displaying, read its input, and send it input events.
This might have been fine in the past, but is not really OK any more with efforts to make things more secure (e.g. to prevent a malicious application from reading your password entry, taking screenshots of sensitive data, injecting input events into your secure sessions, etc.).
The side effect is that new protocols need to be developed that applications can use to request access to the display/input/output of other applications in legitimate cases (such as screen sharing in your case), and not all of that is in place yet.
> The side effect is that new protocols need to be developed that applications can use to request access to the display/input/output of other applications in legitimate cases (such as screen sharing in your case), and not all of that is in place yet.
This is the main hurdle for most people. I would say that 99% of people agree that the Wayland way makes sense and is the better way of doing things, but without the needed access controls it's just not ready yet.
Like if Google said "Apps can't access [location|files|whatever] without permission" on Android with no way to grant those permissions.
>Wayland by default prevents applications from accessing other application display, input and output,
So it breaks the entire linux philosophy of using input and output streams to pipe data between different modular applications?
>This might have been fine in the past, but is not really OK any more
Says who? Personally I like my computer being able to access other things on my computer. It kind of makes it more useful that way. The ability for applications on linux to fairly seamlessly work together using a set of standard protocols is one of the primary reasons I use it.
> So it breaks the entire linux philosophy of using input and output streams to pipe data between different modular applications?
Not really. To use your analogy, the way that X works - every application being able to read the framebuffer of any other - is the equivalent of every application running as root and being able to read and modify any file on the system. When you consider that applications running under Wayland may include e.g. banking details, any app being able to read that is like anything being able to read /etc/shadow.
If your computer is perfectly secure, with no untrusted code running, that's great - and also far more secure than 90% of desktop computers out there.
On many systems, and by default, yes - but the other part of what's going on is that Wayland allows applications to be sandboxed like they couldn't be before, as they can no longer use your X server as a conduit to spawn an unsandboxed shell and run commands. You can, today, run e.g. Firefox in a sandboxed environment and be certain it can't access anything you don't want it to.
AFAIK graphical application distribution/sandboxing systems such as Flatpak pretty much require this to be available if they ever want to provide reasonably secure sandboxing, and might already be making use of it on Wayland systems.
Arguably the end state planned for Wayland in this regard (having access to specific applications provided) is conceptually closer to streams than the current situation with X (one big shared ball of global state).
Not really - I should have been clearer - by input I mean keyboard input and its manipulation.
You can pipe stuff to other executables all you want under Wayland; you just can't easily (e.g. without the user granting permission through the correct protocol) inject keyboard events from one application into another (say, malware masquerading as a game injecting code in the form of keyboard events into a running terminal emulator or ssh client).
Nitpick: This is not a meaningful comparison. Wayland is a wire protocol. Xorg is a display server, and the wire protocol it implements is called X11. There are several other display servers that implement the Wayland protocol. Some of these display servers do support those core features, and some of them don't (yet). It depends on which one you're using. The display server used by GNOME should support those features.
>What is Wayland's reason for existing?
From the website [0]:
>Wayland is intended as a simpler replacement for X, easier to develop and maintain.
wayland may be easier to maintain than X. But from what I've seen writing a compositor for wayland is more difficult than writing an X window manager, because things you got for free with X have to be implemented by the compositor in wayland.
And many applications are more difficult to target wayland, because there aren't (at least yet) standard protocols for things like screenshots, screencapture, etc. So they have to either choose one desktop environment to target, or have implementations for all of them.
Take a look at wlroots [0] for a library that massively simplifies the task of writing a wayland compositor. It also gives many of those lower level things "for free". For an even higher-level API built on top of wlroots, you can look at wltrunk [1].
There are standard APIs for screenshots and screencapture, implemented through the desktop portal and pipewire. Check the top-level post for more info about this -- it's part of why Wayland support for OBS has progressed.
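For anyone wondering what the portal side of screencapture looks like, here is a rough Python sketch of the ScreenCast portal flow as I understand it. The method names and the source-type bit values are what the org.freedesktop.portal.ScreenCast interface documents; the helper names (`SourceType`, `SCREENCAST_CALL_ORDER`, `select_sources_options`) are made up for illustration, and a real client would send these options as D-Bus variants rather than a plain dict.

```python
from enum import IntFlag


class SourceType(IntFlag):
    """Bit values for the 'types' option of SelectSources (per the portal docs)."""
    MONITOR = 1
    WINDOW = 2
    VIRTUAL = 4


# Order in which a client calls methods on org.freedesktop.portal.ScreenCast.
# The last call hands back a file descriptor for a PipeWire connection.
SCREENCAST_CALL_ORDER = (
    "CreateSession",
    "SelectSources",
    "Start",
    "OpenPipeWireRemote",
)


def select_sources_options(types: SourceType, multiple: bool = False) -> dict:
    """Build the options an app would pass to SelectSources (illustrative helper)."""
    return {"types": int(types), "multiple": multiple}


# Ask to be offered both whole monitors and individual windows.
print(select_sources_options(SourceType.MONITOR | SourceType.WINDOW))
```

Note that the app only says which kinds of source it is willing to accept; the actual choice of monitor or window is made by the user in the portal's dialog.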
I'm aware of wlroots. But KDE, GNOME, Enlightenment, etc. don't use it, so each of those have to implement things separately.
Concerning the desktop portal API: it's basically just a wrapper around the native custom APIs of the underlying compositor, and it is pretty limited in functionality. For example, the screenshot API just has a way to request a screenshot, it doesn't have a way to specify that you would like to select a window, region, or display/screen/monitor. In the case of wlr-portal, from what I could tell it just always gives you a screenshot of the full desktop.
>But KDE, GNOME, Enlightenment, etc. don't use it, so each of those have to implement things separately.
I am not sure how this is relevant if you're trying to write your own compositor. If those projects want to create extra work for themselves, that's on them.
>the screenshot API just has a way to request a screenshot, it doesn't have a way to specify that you would like to select a window, region, or display/screen/monitor.
Yes, that's on purpose. What's supposed to happen is that the portal daemon (NOT the application) pops up a dialog asking the user to choose which one they want. Unfortunately the wlr portal is still not done yet and doesn't implement this.
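To make that concrete, here is a minimal Python sketch of requesting a screenshot through the portal by shelling out to `gdbus`. It assumes `gdbus` and a portal backend are installed; the bus name, object path, method name and the `interactive` option come from the portal documentation as far as I know, while the function names are just mine. The call itself only returns a request object path; the screenshot URI arrives later in a Response signal on that object, which this sketch does not listen for.

```python
import subprocess
from typing import List

PORTAL_BUS_NAME = "org.freedesktop.portal.Desktop"
PORTAL_OBJECT_PATH = "/org/freedesktop/portal/desktop"
SCREENSHOT_METHOD = "org.freedesktop.portal.Screenshot.Screenshot"


def screenshot_command(interactive: bool = True) -> List[str]:
    """Build a gdbus argv asking the portal for a screenshot (illustrative helper)."""
    # 'interactive' hints that the portal's own dialog should offer choices;
    # what that dialog actually offers is up to the portal backend.
    options = "{'interactive': <%s>}" % ("true" if interactive else "false")
    return [
        "gdbus", "call", "--session",
        "--dest", PORTAL_BUS_NAME,
        "--object-path", PORTAL_OBJECT_PATH,
        "--method", SCREENSHOT_METHOD,
        "",       # parent_window: empty string means no parent window
        options,
    ]


def request_screenshot(interactive: bool = True) -> str:
    """Run the call and return gdbus's output (the request object path)."""
    result = subprocess.run(
        screenshot_command(interactive),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The application has no parameter for "this window" or "that region"; the most it can do is hint that the dialog should be interactive.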
> I am not sure how this is relevant if you're trying to write your own compositor. If those projects want to create extra work for themselves, that's on them.
I'm actually more concerned about the fact that wlroots has/had to duplicate work done by Gnome and KDE (wlroots is more recent than much of gnome and kde's wayland support).
> Yes, that's on purpose. What's supposed to happen is that the portal daemon (NOT the application) pops up a dialog asking the user to choose which one they want. Unfortunately the wlr portal is still not done yet and doesn't implement this.
Yeah, the problem is that each compositor has to implement its own screenshot dialog, and you _have_ to go through that dialog for that compositor. So on wlroots, currently, an app can only get a full screen screenshot. And a tool like flameshot becomes awkward if the compositor opens its own dialog. In X, if you don't like Gnome's screenshot tool, you have a handful of other options. With wayland, tough luck, the most you can get is a better editor/annotation tool.
>I'm actually more concerned about the fact that wlroots has/had to duplicate work done by Gnome and KDE
I don't think so, GNOME and KDE have never had the goal of making a reusable and generic compositor library like wlroots. You can try to build something with their internal compositor libraries (libmutter or kwayland) but they probably won't be as nice.
>The problem is that each compositor has to implement its own screenshot dialog, and you _have_ to go through that dialog for that compositor.
This is on purpose and it's not the problem. It's the only way to do it securely. The problem is that you are trying to perform a privileged operation, which is the only way that something like flameshot can even work. Allowing random unprivileged programs to scrape your screen without confirmation is how you get trojans and other spyware. It's not worth adding more APIs to the portal just to support this because it's intended to be a secure API that can be accessed from within sandboxed applications.
Sure there are other tools on X but unfortunately none of those options are secure either.
> This is on purpose and it's not the problem. It's the only way to do it securely. The problem is that you are trying to perform a privileged operation, which is the only way that something like flameshot can even work.
That's not true. One way is to have secure protocols that can only be used by whitelisted programs in a secure context. sway has something like this (although by default I think it is pretty open), but there isn't any kind of standard mechanism for privileged protocols in wayland.
Also, I don't see why the screenshot API couldn't take a value for the type of screenshot to take. Like an enum with values for Region, Window, Screen, Full, and Any. To hint at what kind of screenshot to prefer.
Yes, one way to have a whitelist is to pop up a dialog asking to approve elevated permissions for a certain application. This is what mobile operating systems already do. The security implementation in sway is incomplete and has stalled, and is not going to work for all other types of desktop anyway. Pluggable security configuration should probably be added to wlroots at some point. This would allow any compositor to implement their preferred security policy and support whatever MACs or auditing they need.
It actually works in a way which makes sense for modern compositors and GPUs, which means the rendering is much smoother without tearing issues and so on. Issues with getting this to work reliably in Xorg are what led to the maintainers abandoning it to work on a replacement. It just turns out shifting a bunch of software built around a core and complicated interface to another system is quite difficult.
I'm kinda new to Linux as a desktop, and thus went straight to Wayland, so these kinds of comments from ol'-timers are super interesting to me.
I run a Wayland desktop, and I start it by typing its executable from the TTY after I log in. No fuss, no muss.
Everything works great, except there was this one game I wanted to try out that's a Windows .exe and needs to run in Wine and I couldn't quite get it to run in Wayland. So I installed xorg-server and an X window manager. Tried to just run it from TTY and it complained that there was no X server running. Okay, turns out I need another program to start X, then start my window manager, as a kind of desktop chaperone. Finally get that worked out, try running my game, and the screen tearing is a nightmare. So now I have to run a compositor in there as well to be an intermediary in the already extremely complicated X protocol. And since X needs to run as root (I think?), half the time I try to start it, I get odd permissions errors, or it tries to use the wrong TTY. As someone going the _other_ direction, I can't fathom how anyone puts up with X.
The good news is that after it did its initial setup and install in X, the game now seems to run fine in Wayland. :D
X11/Xorg was the default on many distros, so often it was preconfigured in a working state. I started my Linux journey around...2004? And booting into Mandrake or Slackware (or a Knoppix Live CD, my true beginning), X would work fine. But as soon as I had to install it myself (minimal Fedora, minimal Debian, or my fave, stage 1 Gentoo), I'd hit all kinds of issues with configuration and starting the X server.
As a counterpoint, I tried to set up Wayland a couple years back on Ubuntu and Fedora before it was default anywhere, and that was also a nightmare.
It's easy to forget sometimes just how much the distro maintainers make our lives easier.
Xorg has definitely become easier to configure. Back in the old days, you had to write the XF86Config file, either manually or automatically during installation, or else it wouldn’t do anything. These days, Xorg auto-detects everything and you only need an xorg.conf if you’re doing something weird.
Yeah, back then I was using one of the glorious Trinitrons at 75 Hz and 1600x1200. To work at all, I had to manually look up the horizontal and vertical sync ranges and put 'em in the XF86Config.
Debian 11, AMD RX560, KDE on X, FreeSync on and working, screen tearing appears in landscape orientation and isn't there in portrait. If that's the driver problem, then show me any reason it works fine on the same buffer size.
I don't think any solution based on video streaming can ever match what X11 provides, which is that remote apps use the settings of the client computer for rendering. E.g. with ssh -X, if I set my dpi in my .Xresources, no matter which machine I'm ssh'ing to, I'm always getting a correct font size for my local screen.
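The reason this works is that resources loaded with xrdb are stored on the X server itself, so any client that connects, including one forwarded over ssh, can read them. A small Python sketch of pulling the DPI out of `xrdb -query` style output; the function name and the sample text are mine, and it assumes the usual `name:<tab>value` layout:

```python
from typing import Optional


def parse_xft_dpi(xrdb_output: str) -> Optional[float]:
    """Return the Xft.dpi value from `xrdb -query` style text, or None if absent."""
    for line in xrdb_output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "Xft.dpi":
            try:
                return float(value.strip())
            except ValueError:
                return None  # present but not a number
    return None


# Sample text in the shape xrdb prints; the values are made up for illustration.
sample = "Xft.dpi:\t144\nXft.hinting:\t1\n"
print(parse_xft_dpi(sample))  # 144.0
```

On a real session you would feed it the output of `xrdb -query` run on the remote machine with the forwarded DISPLAY.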
I haven't tested but there is no reason that can't be done in waypipe. It works by intercepting certain protocol messages and proxying them over the network. The client just has to be given the output information from the remote machine.
See chapter 7 of the Unix haters handbook from 1994, which was linked here the other day: http://simson.net/ref/ugh.pdf
The amazing thing about Wayland is that it's taken over 25 years to happen. Over those 25 years, X has become less of a problem as CPU speed and RAM have grown exponentially, and we now have GPUs to help it too.
I had to turn on X11 for screen recording once (the screen recording was done by a windows only app running under wine). It didn't take a minute to see extreme tearing. I seriously don't understand how anyone can use X11 other than as a fallback.
Those will cover the capture plugins, but OBS still needs XWayland to run due to a dependency on GLX for rendering. For those interested, there is an open PR here to add native EGL/Wayland support: https://github.com/obsproject/obs-studio/pull/2484
For wlroots compositors there is also the wlrobs plugin, which can be used if you don't need pipewire: https://hg.sr.ht/~scoopta/wlrobs
I think this is a better way to go to get the same performance and low latency gaming capture as on Windows with gaming GPUs.
The guy who made that PR frequently streams coding sessions on Youtube. I think he made it because he wanted a better way to stream some cool live opengl coding sessions. And even though that code isn't production ready, he has used it for some time now and it seems to work great.
If there is some company that slightly cares about Linux desktop and gaming on Linux, I would suggest helping with that pull request and getting it merged. (Anyone from Steam, AMD or Nvidia here?)
Some of the EGL portions of that PR are actually included in the one I linked :) At some point my plan is to go through and merge these all together if no one else does it, but streaming has not been a priority for me at the moment.
Right, I figured capturing is the big ticket item, and that most people wouldn't care what OBS itself runs on. Is there a reason to care about it running on XWayland other than being able to say you don't need X at all anymore? Would you expect to see major improvements for apps that are already doing all their heavy lifting in GLX on X11?
>Is there a reason to care about it running on XWayland other than being able to say you don't need X at all anymore?
I personally don't on my setup but the reason to do it is so other plugins can make use of EGL extensions. Native Wayland support just comes along with that trivially. Future development on platform-integration extensions is expected to happen in EGL instead of in GLX. For a current example the other PR that does direct KMS capture needs EGL to work, even with the X11 backend.
> If you're running a modern Linux desktop you're probably running Wayland
I'm a single data point but I'm running Ubuntu 19.10 and I'm not running Wayland. I don't remember if I opted out during the installation or if I wasn't given the choice.
The top reason to stay on X11 is that no screen sharing application works with Wayland (Meet, Slack, Skype) and I need them a few times per week to work with my customers.
This was more or less your point but from a different perspective.
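If you're not sure which one you ended up on, the environment usually tells you. A hedged Python sketch: `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY` and `DISPLAY` are the standard variables, the function name is just mine, and a minimal setup started by hand from a TTY may not set all of them.

```python
import os
from typing import Mapping, Optional


def detect_session_type(env: Optional[Mapping[str, str]] = None) -> str:
    """Guess whether the current session is 'wayland', 'x11' or 'unknown'."""
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session in ("wayland", "x11"):
        return session
    # Check WAYLAND_DISPLAY first: under XWayland, DISPLAY is set as well.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


print(detect_session_type())
```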
> If you're running a modern Linux desktop you're probably running Wayland
I believe the big exception to this is Nvidia. It looks like things might be changing, but until quite recently, the Nvidia proprietary driver was X11 only, so anyone running Nvidia graphics would automatically fall back to X11.
Wayland won't be a thing until they agree on a common API for real-time screen capture. I've read that some Wayland developers said screen capturing is not a priority, and I can't understand it. Demand for screen capture is higher than ever now that live streaming sites are ubiquitous and people earn money from them. Besides, the easiest way to explain how to use a GNU/Linux desktop to complete beginners is with videos.
The common API is the desktop portal and pipewire. The major projects (GNOME/KDE/wlroots) all agree on this one. Take a look at the links in the GP comment for more info.
The common API is whatever gains traction. Wayland has even less direction than Xorg development (especially in its early days), because it's a spec with lots of holes that others have to implement and fill in, respectively. Even Keith Packard doesn't think Wayland is on a good track anymore.
The Linux ecosystem needs a standard and unified API or SDK for its desktop endeavour like macOS and Windows do.
This is why this whole thread, about the user having to find out whether an app like OBS is running on KDE or GNOME, with X11 or Wayland, is the kind of thing that risks losing traction with general users. I always recommend that people not bother trying out the other distros and use Ubuntu instead.
The Linux community is eternally stuck with its micro-ecosystem of alternatives of alternatives of the desktop stack, which is best described as a Howl's Moving Castle of components.
Also for future Linux app developers, never tell the user to 'compile' something as a way of distributing your app.
>The Linux ecosystem needs a standard and unified API or SDK for its desktop endeavour
In my opinion, this is incredibly unlikely to happen any time soon. The closest existing thing to that is building web apps targeting Chrome and Chrome OS. If that's not your thing, then I would advise against operating on the assumption that there will ever be a unified SDK. At least for me it's gotten easier to understand and work with the open source world after internalizing that. There are both upsides and downsides to it.
Ubuntu is a funny example because they were ready to drop both X and Wayland for a while. They came very close to shipping their own incompatible display server called Mir.
Maybe ChromeOS could do it since it is the closest to this idea.
But distro-wise, if that's the case then the second-to-last sentence in my previous reply is an unfortunate tautology which doesn't look good for those who just want to get work done or need to reproduce/trace bugs in subsystems. :(
My point with web apps is that you can target both Chromebooks (technically a "Linux desktop") and any other system that has the Chrome browser installed.
If you're shipping a native B2B application the standard solution I see is to target a specific distro version (Latest RHEL/CentOS, Ubuntu LTS, etc) and tell customers you only support the default desktop. If they want support for some other weird configuration they can pay extra for that.
The desktop portal has gained traction. This is what we have right now, I don't know how to solve the problem of vendor- or desktop-specific features that need to be supported in extensions. X has experienced fragmentation from having to do this through its entire existence. I think the only thing a protocol designer really can do is make it easier to ship extensions. If Wayland does that for you, you probably know it already.
> If you're running a modern Linux desktop you're probably running Wayland
I missed the point when Wayland took over all the major modern distros. Has it superseded Xorg now? I've been using X11 forever and never thought of alternatives.
Debian Buster (which is "stable" now) defaults to Wayland in Gnome, but you can switch to Xorg at the login window, which I had to do this week for screencasting to work as expected.
That's mostly the case now, but that's a much more recent development than Wayland on the desktop becoming popular. Also, pipewire gives you the plumbing, but you still need the portal API. My understanding is everyone more or less agrees that's the way forward, but it's still not stable and ubiquitously implemented. Even then, that resolves capture but not control: if you want something closer to ssh -X you need ways to forward input too, and IIRC right now the main answer for that is still compositor specific, e.g. krfb relying on kwin/KDE.
There's still plenty of things which don't work with Wayland. And even Gnome Shell, which perhaps has the best support for Wayland, doesn't work very well with Wayland on some of my machines (Gnome Shell is extremely slow and jerky at least on my one machine if I try to run it with Wayland).
I've been using screen sharing on Plasma/Wayland for a while now and it works absolutely fine. With krfb remote desktop control is also fully available. The latter uses a KWin specific protocol though IIRC as virtual input isn't part of the portal API.
I am currently running Gnome in wayland and multiple displays with different fractional scaling settings, it works fine. On the other hand in Xorg I can't set different scaling factor for each monitor (at least gnome doesn't allow it), neither can I use fractional scaling.
OBS Plugin: https://gitlab.gnome.org/feaneron/obs-xdg-portal/
For GTK: https://github.com/flatpak/xdg-desktop-portal-gtk
For KDE: https://github.com/KDE/xdg-desktop-portal-kde
For wlr: https://github.com/emersion/xdg-desktop-portal-wlr