Browsers are not the right tool for sandboxing applications. I don't think it was a mistake, but it's time to take what we've learned and move to the next level.
With a bytecode like wasm, you can create an "app runner" program that's at least an order of magnitude less complicated than current browsers. Just ship apps as wasm binaries with a simple interface (maybe WASI, haven't taken the time to dig into it yet) for requesting/providing filesystem, networking, graphics, and other resources (if you think this sounds like java, see [0]).
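To make the scale concrete, here's a sketch of what the core of such a runner could look like, using the wasmer crate's 1.0-era Rust embedding API (the anyhow crate and the "app.wasm" path are assumptions for illustration; method names may have shifted since):

    // Minimal "app runner": load a wasm binary, wire up WASI for
    // filesystem/args/env, and invoke its entry point.
    use wasmer::{Instance, Module, Store};
    use wasmer_wasi::WasiState;

    fn main() -> anyhow::Result<()> {
        let store = Store::default();
        let module = Module::from_file(&store, "app.wasm")?;

        // WASI is the "simple interface" for requesting resources;
        // nothing is visible to the app unless granted here.
        let mut wasi_env = WasiState::new("app").finalize()?;
        let imports = wasi_env.import_object(&module)?;

        let instance = Instance::new(&module, &imports)?;
        instance.exports.get_function("_start")?.call(&[])?;
        Ok(())
    }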
And I think there's room for diversity here. I can imagine a world where it's normal to have 3-4 different app runners installed on your system. Developers would target one of these runners for their app, depending on needs (performance, GUI libs provided, security guarantees, etc), and tell you which app runner you need for the app.
You've just described the hardest parts of a web browser, while ignoring the necessity of a JIT compiler for your bytecode.
As a proof by counterexample: the "simple wrapper" already exists, in the form of a graphics stack like GTK alongside the POSIX APIs. It turns out this is insufficient for application development.
Some things you're missing are text layout and accessibility. Neither are small or simple.
The browser is the new OS and they're big and complicated like OSes by necessity, not accident.
All the things you mention could be available as a shared library, to be run inside the sandbox along with the application.
Let's keep the sandbox as simple as possible. We can build complexity inside it, and not make complexity part of it. This makes security guarantees much simpler.
The issue I see is we don't need a sandbox at all for this, we need capability based security models for applications provided by the OS vendors (like iOS, Android). Apple has taken the right steps towards such a model, yet get lambasted by developers for it. We have flatpaks and snaps for Linux, the former isn't really targeted by many and the latter is similarly lambasted by users and developers. I don't even know what Microsoft recommends today, since no developer is going to use it and that rec will change tomorrow.
While complexity is not technically necessary, pragmatically the sandbox offers next to nothing for users or developers without OS vendors enforcing its usage, or without the sandbox being so compelling a target that developers accept its limitations - which is why the only platforms with robust sandboxes are mobile and the browser.
IMO iOS and Android are terrible examples of how to do this. They break decades of convention in how to write POSIX-like applications, and force developers into locked-in practices.
Android gives lip service to the NDK, but in practice you can't really run many useful C/C++/Golang/Rust/etc programs on Android, because the system breaks POSIX. You can't keep a simple webserver running without starting a foreground service, which requires Java/Kotlin. There's no /etc/resolv.conf for DNS resolution, so you have to use custom servers or use the Android NDK compiler (which can be a huge pain). Plus they lock down the filesystem more each year. Android 10/11 are a complete mess on this front. Fortunately they've rolled back some of the more draconian changes, but only because of developer outcry. The fact that they even wanted to do some of the things in the first place is sad.
I'm not saying this is an easy problem to solve, but it's obvious that the big boys have only made a token (if that) effort to build a good, open way of doing it.
I think proper sandboxing on a traditional Unix is hard enough that ~nobody does it right; you can't let any apps run as the user if you want to prevent them from snooping on each others' files, but the user needs to be able to control all of them.
I don't know of a justification for removing /etc/resolv.conf, but I guess "applications' networking should be under the control of the user" is sorta necessary; I think it'd make more sense to do it with network namespaces or iptables rules or something, but /shrug.
>> you can't let any apps run as the user if you want to prevent them from snooping on each others' files, but the user needs to be able to control all of them.
I was thinking about that recently and my conclusion will be highly controversial.
In the same way Wayland implements security that X never could, I think the GUI might be a place to implement file and folder access permissions. If the user does not grant access to a file through a dialog, the app couldn't read it. Same for folder access, which would grant an application access to an entire folder or perhaps an entire hierarchy.
Obviously this would break some things, but real change may require some of that. How much, I don't know.
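A sketch of how that could look from the app's side; the Shell trait here is entirely hypothetical, but the pattern (the trusted compositor UI hands back an already-open handle instead of letting the app name paths) is the same one xdg-desktop-portal uses today:

    use std::fs::File;
    use std::io::Read;

    // Hypothetical interface: the file picker runs in the compositor,
    // outside the app's sandbox. The app never sees or chooses a path;
    // it only receives an already-open handle if the user granted one.
    trait Shell {
        fn request_file(&self, purpose: &str) -> Option<File>;
    }

    fn load_document(shell: &dyn Shell) -> Option<String> {
        let mut file = shell.request_file("Open document")?;
        let mut text = String::new();
        file.read_to_string(&mut text).ok()?;
        Some(text)
    }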
>IMO iOS and Android are terrible examples of how to do this. They break decades of convention in how to write POSIX-like applications
That's a good thing. We should break "decades of convention in how to write POSIX-like applications" if we want to get any further than POSIX-like OSes and programming models...
> Squeeze extra performance out of a device to achieve low latency or run computationally intensive applications, such as games or physics simulations.
> Reuse your own or other developers' C or C++ libraries.
Writing POSIX CLI applications is not a goal.
ISO C and ISO C++, alongside OpenGL ES, Vulkan and OpenSL are more than enough to have something like Qt running.
I'm a big fan of the capability-based photo access in iOS; when an app tries to access my photos it opens a system dialog which allows me to select the set of images the app can see.
That's on them ("regular computers"). Their OSes could offer the same capabilities to map to.
That said, it's not:
(a) as if there aren't cross platform photo apps that run on iOS and "regular computers". Photoshop and Lightroom come to mind immediately, Affinity Photo also. And those aren't just iOS and Mac, they're also on Windows. So there's that.
(b) as if what would slow you down / prevent you from doing such a cross platform photo app would be the iOS capability / image access request feature.
You're completely right, of course. Seeing the amount of effort put into virtualization, sandboxing, containers, etc. is enough to make one wonder why every enterprise is so shortsighted. Capabilities obviate all of the above, and are neither a new untested idea nor more complicated than what we have--as we clearly see in the repeated attempts to build things that approximate them with ACLs and sandboxes and seccomp rules and so on.
Linux is slowly moving toward capability-centric design with more and more comprehensive namespacing and file-descriptor-based interfaces to things like pidfd and memfd, but there's still a long way to go before we can jettison ambient authority such as the filesystem entirely. Meanwhile, Google's Fuchsia may deploy a capability-based operating system to the masses, but it will likely only be used to sandbox applications written for Android anyway. The real potential of capabilities is to simplify the interface of a power-user operating system by eliminating the race conditions, side channels, privacy leaks, firewalls, virus scanners, and unix-style permissions from developers' and users' day-to-day experience completely.
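pidfd is a nice small example of that direction: instead of acting on a process through its ambient, recyclable PID, you hold a file descriptor naming exactly one process. A Linux-only sketch via the raw syscall, assuming the libc crate (glibc has since grown a proper wrapper):

    // pidfd_open(2): turn a PID into a file descriptor. Unlike a bare
    // PID, the fd cannot be recycled out from under you, so signaling
    // via pidfd_send_signal(2) has no PID-reuse race.
    fn open_pidfd(pid: libc::pid_t) -> std::io::Result<std::os::unix::io::RawFd> {
        let fd = unsafe { libc::syscall(libc::SYS_pidfd_open, pid, 0) };
        if fd < 0 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(fd as std::os::unix::io::RawFd)
    }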
There will still be memory-safety zero days, of course, until we abandon languages where humans are statistically incapable of writing memory-safe code.
People (developers) want a target (sandbox) that is as "open" as their browser. This means that by definition the slightest presence of business-motives would ruin the entire idea (your example of Ubuntu's snap and Apple's iOS and Google's Android).
I think you're confusing two very different concepts - the fact the browser is sandboxed is orthogonal to its "openness." Sandboxes need not be open, and open platforms need not be sandboxed.
Fundamentally, sandboxes restrict the possible functionality of executable code to prevent it from violating some preconditions within the context a user invokes it. Rarely do developers desire that behavior (it makes it harder for your app to work!), and users complain about it too (when MacOS began requiring explicit permissions for many entitlements, many users complained about the constant nag screens). Browsers have historically needed to reinvent every wheel provided by operating systems to provide features developers require that can't work within the sandbox (cookies, websockets, etc) which prevented entire classes of application from existing.
Browsers have enormous business incentives behind them, I'm not sure why you think that ruins the idea. Google spends gobs of money to prevent anyone from developing a competing non-Google browser, for example. Apple bans any non-Apple browser from existence on iOS and iPadOS.
I like this vision of shared libraries sandboxed inside a minimal runtime to replace the browser. But there are some serious problems with it (that I hope can be resolved somehow).
Replicating a significant fraction of the functionality of a browser ends up being a lot of code and data. For a web-like experience with instant navigation between apps it will be necessary for these shared libraries to be really shared, as in almost all apps use the same libraries and the same version of those libraries. Otherwise you'd be downloading and unpacking and JIT compiling dozens or hundreds of megabytes of code and data before you could display a single line of text, every time you click a link to a new site. The linkable nature of the web would be lost. (Yes, many websites are dozens of megabytes already, but the bulk of that is delay-loaded and happens in the background after the initial content is displayed).
In order for the standard libraries to be actually shared, everyone would have to agree on the standard libraries to use, and the standard libraries would have to be designed to be backwards compatible so that older sites could be force-upgraded to use newer versions. This is essentially the role that browser updates play today. But the browser evolved over decades and it's not clear how you'd get everyone to adopt a new set of standard libraries today, nor how you'd get everyone to agree on what functionality to add over time (the role that W3C plays today).
For libraries outside of the standard set, it sounds tempting to allow them to be shared too. But this is actually an unavoidable privacy violation. If you have a shared cache, then any app can discover the set of previously loaded libraries with a timing attack, thus revealing information about the user's browsing history. This is why all browsers are now moving to partition all caches by site: https://www.jefftk.com/p/shared-cache-is-going-away. Even if multiple sites reference the same wasm blob, it must be downloaded and JITed separately with no sharing.
As I said, I like the vision of a tiny low-level runtime executing shared libraries inside a sandbox. But I'm not sure how it could be done in an efficient and privacy-respecting way if you want to replace the Web with it.
Text rendering and accessibility APIs both need to access the host OS's APIs to work correctly. (Text rendering is subtly different on macOS and Windows, and app rendering behaviour should follow the platform.)
I think we can expose standardised APIs to applications for this stuff, but the API will probably need on the order of hundreds of methods.
While it's probably true that a full GUI ecosystem could be built on top of WASI+WebGPU+windowing, it would mean that programs would often reinvent the wheel. Perhaps that's not a bad thing.
Speaking of moving complexity into sandboxed code, browsers could potentially run js engines inside the wasm vm, reducing a lot of complexity there.
>While it's probably true that a full GUI ecosystem could be built on top of WASI+WebGPU+windowing, it would mean that programs would often reinvent the wheel.
You can compile existing GUI libraries; you just need to port the rendering backend and input handling.
As others are pointing out here: the cross-platform value-add of browser based apps is the GUI layer, not the code runtime. We've had the JVM for decades, and it even has a GUI layer, it's just a much less powerful GUI layer than today's web.
People have been trying to create a simple, native, platform-agnostic GUI layer for as long as GUIs have existed. Nothing but the web has ever come close to succeeding. It may be a flawed solution, but it's what we have, and it works, and I wouldn't hold my breath for a greenfield alternative supplanting it any time soon.
Sorry, but I think there's some Stockholm syndrome going on here. People use the web for apps because people use the web for apps. It was in the right place at the right time and an extinction event chose JavaScript. That doesn't mean we can't make something better (doesn't mean we can, either).
I really wanted to like Swing - despite it really being an overengineered mess, with some IDE GUI tools for creating UI layouts (a crutch that you might want to use, given how messy the code-first approach would be), it was actually a passable way of creating cross-platform GUI apps. Though you'd still need JDK, which was a drawback (and it wouldn't look or feel native, but then again, at least it worked). JavaFX/OpenJFX, despite its advantages, somehow feels worse integrated into the development workflow that some people like me used to have.
Never really got into GTK+ or Qt, or other tools that are more wieldy in the lower-level languages, I'm afraid, though I'm sure that with enough time investment, bindings for other languages could be found and you could use them well enough.
Though the closest I've seen to "easy" cross-platform GUI development that actually respects the underlying platform (unlike Electron), but that also doesn't make your hair go gray, was the way FreePascal + Lazarus handled it, by attempting to support multiple frameworks in a transparent manner, where possible ( https://en.wikipedia.org/wiki/Lazarus_Component_Library ). Now, normally something like that would fail, since abstractions are leaky, but it seems like a number of useful pieces of software were actually written that way ( https://wiki.freepascal.org/Projects_using_Free_Pascal ).
It's just sad that the language is kind of dead nowadays, as is the idea of having up-to-date tooling around it, or even many tools for developing web APIs. :/
If you're trying to build out an arbitrary UI, there is no GUI layer today that's more flexible than the web (in terms of screen sizes, customizability, layout styles, look and feel). Maybe you could argue that this flexibility arose because of the popularity it gained by being in the right place at the right time, but at this point the network-effect is not the only reason we use it. It has genuinely become a more powerful platform than the alternatives.
I'm not saying we couldn't do better - the CSS API has a lot of warts and the DOM has performance issues if you don't know how to avoid them - but I am saying we had decades to do better and didn't, so there must be something very difficult about doing better and we shouldn't discount the only thing that's ever really succeeded.
> but I am saying we had decades to do better and didn't
If you ignore everything outside the web, sure.
Games routinely render thousands of extremely complex objects at over 60 fps. The browser cannot animate a `height: auto` element or display more than a few thousand simple elements without stuttering.
Clearly we have done better. Just not on the web.
And don't get me started on actual complex UIs. TurboVision from the 90s, Qt and Delphi from early 00s had more power and ease to create complex layouts than anything the web can offer today. You can't even implement a proper accessible modal dialog on the web without tearing your hair out.
You're talking purely about performance. Sure, you can do whatever you want in a game UI, and it will probably outperform a web UI. But there's a reason game UIs often scale/function badly as soon as you change the screen size or the input type: they aren't built on a platform that has flexibility baked into its bones. In many cases they aren't built on any kind of platform at all. They often have element/text positions hardcoded to a specific screen shape. Which is exactly what the browser spends most of its effort preventing: all sizing and positioning is determined automatically based on rules and content, which can be expensive to calculate, but is what makes it so robust.
Qt still exists. It's still being actively developed and still gets used for some modern software. It's actually a perfect stand-in for the current peak of native cross-platform frameworks. It's a contemporary of the web, it supports all the platforms, it has a rich feature set. It even directly supports several languages, something the web doesn't really do. And yet: it's only used in a very small fraction of the apps out there. Why? Because (if you ask me) it simply isn't as flexible or accessible of a GUI layer. It's a very performant GUI layer, and it works very well for certain types of applications. But the web works pretty well for all types of applications. And that's exactly my point.
> And yet: it's only used in a very small fraction of the apps out there.
Qt is estimated to account for ~10-15% of all C++ development, which is a very fair amount for any one single framework. You just don't see it because most of the time it blends in perfectly. Do you think about Qt when you're using your LG TV? (LG webOS is Qt). Or OneDrive Desktop?
> Sure, you can do whatever you want in a game UI, and it will probably outperform a web UI.
Not probably, but definitely
> But there's a reason game UIs often scale/function badly as soon as you change the screen size or the input type: they aren't built on a platform that has flexibility baked into its bones.
Yeah, no. Games have had UI scaling since monitors got more resolutions than 640x480.
The reason why the web is horrible at UI performance is simple: it was never built for UIs. It is, at its core, a system to display simple texts with a few images in between, and that's it.
Look at the sad state over at https://csstriggers.com. Setting text shadow will cause a full-screen reflow. What does this have to do with "flexibility baked into its bones"? Nothing. If anything, it's entirely inflexible. Same for any type of animation apart from a handful of CSS transforms: touching any part of an element will cause a full-page reflow (and that's why animating `height: auto` is one of the web's holy grails). This has nothing to do with "changing screen size or the input type". The web was never built for this, and anything outside simple-text-plus-a-few-images is just a bolted-on hack that barely works.
> all sizing and positioning is determined automatically based on rules, which can be expensive to calculate, but is what makes it so robust.
iOS uses a significantly more complex system of layout constraints, not dissimilar to Cassowary. Cassowary was proposed for the web in the late 1990s and was rejected because it was considered too expensive to calculate on the hardware of the time. The vast majority of the web was designed with one goal in mind: avoid expensive calculations. To the point that the actual algorithms deciding how to lay out and draw stuff are specified in the standards. And the core of the web is designed around ... a single pass. The first time an allowance for two passes appears is for some stuff in tables (was it vertical alignment?) in the early 00s.
And the actual complexity on the web comes from the fact that it was never designed for any of this stuff. And, of course, there's no robustness to speak of. Even looking at the web sideways will break it in innumerable ways: from the fact that just asking for an element's size will cause a full-page reflow to the inability to properly encapsulate anything.
> Qt still exists. It's still being actively developed and still gets used for some modern software.
I know it still exists. What I was saying is that Qt and Delphi in the early 00s and TurboVision in the 1990s had tools that surpass anything the web has to offer today. Which is not unusual, because they were developed as UI frameworks/libs/tools. Unlike the web. It can barely display a simple page with text and images, and now we've built an unholy mountain of hacks on top of it.
The web is the first and only UI system that lacks any and all tools to actually build and customise a UI beyond a very restricted set of badly interconnected very rigidly defined primitives. Where any UI lib/framework gives you a rich set of complex controls, the web gives you a few form elements with little to no functionality. Where any UI lib/framework gives you a plethora of tools to design/change any control or primitive you want (going as far as letting you redefine your own paint functions), the web gives you nothing.
I keep repeating this in various comments, but: there's a reason why most CSS/UI frameworks on the web repeat the same handful of primitive elements: a badge, a text input, a button, a breadcrumb, an avatar. Because that's all you can do in any reasonable amount of time. Very very very few attempt a date picker or a complex dropdown (inevitably with significant a11y issues). More complex UI elements like virtual lists, tree views, master-detail sheets, proper data tables, truly complex layouts where you can actually mix and match any and all components? Ahahahahahaha. And of course, nothing even approaching the visual editors of the early 00s, because they are simply impossible on the web.
> But the web works pretty well for all types of applications.
The web performs poorly for any type of application and performs barely well enough for pages.
> Games have had UI scaling since monitors got more resolutions than 640x480
Scaling is different from resizing. Maybe it was my fault for using the wrong word.
Have you ever tried to play an old game on a widescreen monitor? Even a lot of modern, triple-A games have weird HUD layouts on ultrawide monitors because they have to be re-hard-coded for that aspect ratio. So if they haven't, you just get the layout they're hardcoded to for widescreen monitors.
I'm not saying it would make sense to trade performance for robustness in the case of games specifically, because they're always going to be full-screen and usually target one of only a handful of aspect-ratios (at least for consoles and PC; mobile may be a different story). I'm just saying they definitely get to skip a whole lot of hard problem-solving due to the constraints of the end product.
I don't have time to read more than the first part of your reply right now; maybe I will later.
> Have you ever tried to play an old game on a widescreen monitor? Even a lot of modern, triple-A games have weird HUD layouts on ultrawide monitors because they have to be re-hard-coded for that aspect ratio. So if they haven't, you just get the layout they're hardcoded to for widescreen monitors.
Have you ever seen any UI on the web that approaches the complexity of game UIs? I haven't [1]. Anywhere there's such complexity on the web, you get lots and lots of hardcoded values: from hardcoded media breakpoints to entirely different layouts for different resolutions and screen sizes.
> I don't have time to read more than the first part of your reply right now; maybe I will later.
No worries, I got carried away a bit. As Stephen King calls it, "diarrhoea of the word processor" :)
> Have you ever seen any UI on the web that approaches the complexity of game UIs?
I have personally built tools on the web more complex than the game UIs in your link.
> Anywhere there's such complexity on the web, you get lots and lots of hardcoded values: from hardcoded media breakpoints to entirely different layouts for different resolutions and screen sizes.
...and I didn't use a single media breakpoint in any of them, and very rarely hardcoded any positions or sizes.
The only time I've ever needed an explicit device-size breakpoint, it was for a site that needed to toggle from a laid-out navigation menu to a hamburger menu when the user was on a phone-sized device (or had their window narrowed to that size! the web doesn't care what device you're actually on, just how much space is available). There was a single breakpoint for "phone-ish" sizes, and that was that. It was necessary in this case because, unlike most values in CSS/the DOM, it was a discrete change and not a continuum. In most cases media-queries are a code smell.
Sorry for butting in on this interesting conversation but regarding media queries being code smell, I currently see no other portable way to deliver multi-resolution background images. :-( Without JS libs that is...
"Code smell" doesn't mean "never justified", it just means "you should look at it with a skeptical eye"
Your usecase sounds legitimate to me. Given that you're talking about different image files, you're fundamentally talking about discrete cases, for which media queries are appropriate. The problems come up when people take something that doesn't need to consist of discrete cases - like element sizing, or whatever - and turn it into a set of discrete cases using media queries. This makes the page fragile against unexpected new sizes, instead of interpolating cleanly. Good CSS derives as many of its values as possible by channeling the continuous inputs of content size, viewport size, etc into other continuous values like the size and positioning of individual elements. The browser already does this by default; you just have to implement your vision while preserving as much of that intrinsic property as possible :)
> I have personally built tools on the web more complex than the game UIs in your link.
Then you are a far better developer than I am, and I dare say than the vast majority of developers out there. It will take most developers no time to come up with a UI of any complexity in any of the non-web toolkits. Whereas on the web even small tasks like reliably placing a badge on an avatar are a thankless gargantuan task. Can't speak for the game UIs (they are undoubtedly hard), but consider the sheer diversity in them that the web hasn't seen since the Flash era...
Canvas isn't far off. The web is excellent for the types of UI that people often want to build for application: nestings of boxes interspersed with text.
So excellent it didn’t have a grid layout until 2018, flexbox landed 2016 and before that there was heated debate for years between table and div to properly align things.
Modern native canvases are hardware accelerated; a browser canvas can be, but you never know, which is another reason why HTML5 games are a glimpse of what Flash achieved.
If that was the case, Google wouldn't be pushing Houdini, and everyone else pushing WebGL and WebGPU.
By the way, already mastered all CSS tricks to force hardware rendering?
Motif might be irrelevant, yet its ideas live on across all modern toolkits for native programming, and the Web had to wait for the WPF team to bother submitting WPF grids to the W3C as a base concept.
Second, there were a bunch of alternatives, and the Web won out over them, and not just because they weren't there.
I know, some people like running around telling everybody Qt, GTK, Swing or whatever were or even are better, but they are seemingly not better on the points that matter for most GUI apps.
But yes, we should make something better going forward.
Yes. The information that "Stockholm syndrome" is a bogus pop psychology theory, disputed as to its reality today, has proven extremely useful anytime the core argument of the other side is that something was/is "bad" and has only been retained "because of Stockholm syndrome".
- click a URL to download an installer.
- run the installer, which has access to large portions of your system (or, if it's an older Windows program, probably needs to be run as an administrator so it can add an auto-updater or some nonsense like that).
- run your program.
That's unnecessary. I ship my apps as single-file executables that are ready to go. In the worst case you can use a directory like zoom does (on Linux at least) which contains all the libraries and dependencies you need.
That's still way more awkward than what the web provides.
- I don't need to uninstall / delete a website when I'm done with it
- I don't need to pick an executable format. (Portable or installed? windows mac or linux? dpkg or rpm? From the dev's website or through homebrew/apt? Is a portable executable even available for this application?)
- I don't need to give the author of the webpage access to all my local files.
Native programs are also often 10x+ as large as websites. (E.g. the Facebook iOS app is somehow 488 MB)
These problems are all solvable. Wasm alone will go a long way to solving the sandboxing and app format problems. But we still need to build the platform, whatever it is.
Right! And despite that, it loads in seconds. Traction would be very different if people had to install Facebook on their computers to try it out!
"Oh yeah its this cool social network.. just download - no, not the portable executable version... yeah, its 300 megs... Then you need to click it to install it"
I think this is mostly a problem with how modern operating systems work. Your point is still definitely valid for the real-world consequences of this, but I think that theoretically things could be way simpler.
1.) Click "download" to get a piece of software you need (a static executable)
2.) Put said piece of software in a folder called "/apps/$APP_NAME"
3.) Run said software and have it persist all of its data to "/apps/$APP_NAME/data" or some other convention like that.
Perhaps that'd need to be enforced with sandboxing and limiting access to directories, which would then make sharing data across apps more difficult, which would end up with something like "/data/$APP_NAME", which brings us back to the horror of Windows' "AppData" and "Documents" and "Program Data", or other folders like that, resulting in what can only be described as an Eldritch nightmare, in which you have no idea which of the 20-30 different folders a particular app stores its data in...
But theoretically things could be simpler. I've heard people mention how easy installing apps used to be on Apple devices back in the day. And recently, after running the Godot game engine ( https://godotengine.org/ ), which is literally a single executable, I'm inclined to agree somewhat.
Yeah this is my point too! Modern operating systems are made out of code, and I know it sounds weird but we can change them!
Ideally I'd like applications to work the same way websites do. I want to be able to open an application directly from its URL, and it should download and run a contained wasm pack file in a sandbox. The app should then be cached locally - maybe with some ability to pin it to my start menu (and pin the cache entry).
The app should be given some access to a rendering context (with fonts, sound, etc) and some limited storage. And not much else by default - though the user should be able to bless the application to access other files on the filesystem. There's a huge laundry list of APIs the apps will need eventually - and thats going to be a lot of work and I have no idea who'd fund something like that. But it'd be fantastic if we could pull it off.
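WASI's preopened directories already model the "bless a path" part: the runner decides up front which directories exist from the app's point of view. A sketch with the wasmer_wasi 1.0 builder (the paths are made up for illustration; method names are from that release):

    fn make_wasi_env() -> anyhow::Result<wasmer_wasi::WasiEnv> {
        // Nothing outside these grants is reachable from inside the module.
        let wasi_env = wasmer_wasi::WasiState::new("app")
            // Scratch space the runner manages as this app's limited storage.
            .preopen_dir("/var/cache/runner/app-id")?
            // A directory the user explicitly blessed via a picker dialog,
            // mounted inside the sandbox under a neutral name.
            .map_dir("/docs", "/home/user/projects/report")?
            .finalize()?;
        Ok(wasi_env)
    }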
Biased reasoning: a JVM app requires the JVM and a web app requires a browser. Neither of those apps is executable by itself. The only difference is that one of those runtimes is shipped in most OSes (the browser) while the other is not.
It depends on what kind of application you mean. Headless servers doing basic file and network I/O are pretty reasonable to sandbox. This is what Docker does.
But GUI applications are a whole different thing. The standards simply don't exist to make that easy. It's not just basic graphics. Things like internationalized text rendering, text input, and accessibility are ridiculously hard and only a few mostly-complete implementations exist. (Such as in browsers and operating systems.) And they need to keep moving as the culture changes; you want to support the latest emojis, right? Well maybe you don't, but your users will.
Furthermore, these implementations are at least partially specific to particular programming languages. If you want something portable, you might build on the Skia graphics library but that won't give you a way to build an app on its own.
Indeed; the web provides far more than just a standardized, sandboxed, cross-platform runtime with multiple independent implementations. There are also lots of higher-level components which benefit from standardization. Text rendering and input are just the tip of the iceberg.
For example: navigation (back, forward, refresh, etc), URLs, permission management, text search, image/video rendering, scrolling, credential management, etc; the list goes on.
That isn't to say there couldn't some day be a simpler alternative to the web offering similar features; but it has some pretty stiff competition. The only thing that even comes close today are modern operating systems like iOS and Android; and none of those are open standards like the web is.
Most of the things you listed apply to web documents more than web apps. I'm an advocate for separating the two. The UX for browsers is great for what they were designed to do. Running apps is not it.
They apply to both documents _and_ web apps, and the behavior is standard across both. When I hit "Back" or Ctrl+F in a web-based chat application the result is the same as when I do those actions while browsing a news article, and that's a good thing.
Huh? When you click the back button in your chat app, if it happens to do something intuitive it's because that behavior was hand-coded by a JS developer using the history API.
And good luck even getting to back button from your news site through all the ads, lazy-load images causing reflow, and auto-play videos that wouldn't be possible in a nicely designed minimal document viewer.
It might be harder for news site operators to fill their sites with junk if they had to explain to their users why it needs to run in an app browser with 20 privileges rather than in their default document browser.
> because that behavior was hand-coded by a JS developer using the history API.
Yes, the app developer (who happens to be using JavaScript and makes their app available via https://) decided how the back button works in their application. The only difference between the web and a UEFI-based OS is that the browser has defaults for the back button or Ctrl+F, while the traditional OS will, at most, handle the window border plus minimizing and maximizing.
The same goes for hitting the back button in an Android app. Just because it's _possible_ for apps to misbehave doesn't mean it isn't good for those standard OS features to exist.
Ever tried hitting the back button in like, TurboTax web app? The web is totally broken for apps. I think what you are referring to here is the _intention_ of the web to be consistent. The reality is far from this.
Docker doesn't provide a strong security boundary. It's a sandbox only in the sense that it protects the container and the host from inadvertent entanglement (mostly).
(And the nonportability of images is another thing)
I think Wasm has the potential to become the lingua franca for software.
It allows containerization in a much lighter way than virtualization (KVM), but also in a platform-agnostic way.
> Developers would target one of these runners for their app, depending on needs (performance, GUI libs provided, security guarantees, etc), and tell you which app runner you need for the app.
We had this. ActiveX, Flash, Director, Air, Silverlight, Java. It was a mess, and I think you'd be hard pressed to find very many folks who found it desirable.
There are reasons each of those failed that have little to do with the core concept.
And to be honest, I think Flash was killed before there was a suitable replacement to fill the gap. It took more than 10 years since it "died" to actually get rid of it, because HTML5 isn't actually up to the task.
I think the fact that it survived so long proves there's a use for these types of runtimes.
I've commented before on this subject, but Flash didn't survive because it took HTML5 10 years to fill the gap. Flash survived because:
A) enterprises don't upgrade or move away from technologies period (we still support IE11 and old Windows Server deployments at my current company).
B) Flash the authoring tool never got replaced. It's not about whether or not HTML5 is up to the task of being a Flash-equivalent runtime, nobody built a replacement for Macromedia/Adobe Flash targeting any runtime anywhere. To this day, there is not a vector animation tool on par with Flash for any platform, and that has nothing to do with the capabilities of the web.
People blame HTML5 when what really happened was Adobe abandoned their authoring tools and nobody built an alternative because we expected the W3C or somebody to do it for us. But a programming language will never be a replacement for a full-featured content creation tool and IDE.
Even WASM is not going to replace Flash unless some company or group somewhere actually sits down and writes a native program that can target it and that is actually competitive with what Flash the authoring tool offered devs. I have not seen serious effort in that direction, so if you're hoping WASM just magically makes these problems go away, I'm a little less optimistic about that. WASM is just going to be another runtime target -- a good one, but not a replacement for an IDE.
Yes, my point would more accurately be made by replacing "HTML5" with "HTML5 platform/ecosystem". Your point about Wasm not automatically guaranteeing this is important. But it would be much easier to build such a tool for a simple greenfield runtime than for a modern browser.
I have reasonably high hopes for WASM as a runtime, so I'm trying to walk a line between agreeing with you and disagreeing with you.
I do think that compiling C code to WASM is easier than compiling C code to Javascript, and I do think that WASM is going to open some doors to sandboxed native applications that the web just can't handle.
But at the same time, even forgetting about the cross-platform part, I have not seen an application framework for any platform -- even closed-down, highly consistent platforms like game consoles -- that offers the same feature-set as Flash.
So WASM may make this easier, but we have platforms already today that are extremely easy to target. If you're targeting PS4/XBox, you know exactly what drivers, software, and graphics stacks are going to be installed. And a Flash replacement doesn't exist for any of those devices. So I urge some caution about assuming that the tools you want are going to be quickly built for WASM.
In some ways, this is exactly the mistake that HTML5 advocates made with Flash. They assumed that all they needed to do was have APIs, and Adobe would do something to save Flash. But Adobe never really seemed to care. They had some half-hearted efforts to target devices like Android, but the Air runtime was never really good at that, and it didn't take over as a cross-platform desktop target the same way that modern Electron has. It seems intuitively obvious that if a cross-platform runtime existed that everyone loved, somebody would make a good graphics stack for it that was easy to use and that didn't require programming, but I'm not 100% certain that's actually true.
I think we're disagreeing less than you think, mostly just focusing on different things. I'll defer to you on Flash. I've never programmed in it directly. My observations are purely from the outside. I was just using it as an example for why I see potential in app runtimes.
>> Browsers are not the right tool for sandboxing applications.
Browsers have been picking up the slack where OS and desktop developers have fallen down. The OS is supposed to handle process isolation and resource access, but here we are. Tabbed browsing became a thing because GUI toolkits didn't do multiple app instances (or MDI) in a way people liked.
When the browser first came out, the OS didn't even have a native TCP/IP stack! The internet was an app you installed. For a long time the core purpose of the browser was to pickup the slack/gap in the OS treating the internet as a first class thing.
Now, as the internet calms down a bit, the OS handles everything natively, including the security primitives (at least on mobile), and so the purpose of the browser is sadly diminishing. It's a less useful experimental side channel to the OS.
A significant point of wasm is runtime-independence. If a runtime implements the right apis, any wasm module that uses them should be able to run (performance aside).
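That's already visible with the wasm32-wasi target; assuming a standard Rust toolchain, the same compiled module runs unmodified on any runtime implementing WASI:

    // hello.rs -- one binary, any WASI-capable runtime:
    //   rustup target add wasm32-wasi
    //   rustc --target wasm32-wasi hello.rs
    //   wasmer hello.wasm      # or: wasmtime hello.wasm
    fn main() {
        println!("same .wasm module, any runtime that implements WASI");
    }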
> If a runtime implements the right apis, any wasm module that uses them should be able to run
That sounds great right now, when the APIs are relatively small, but will quickly become a nightmare when they grow in complexity.
I mean, technically any browser that implements the right stuff can run any web page on earth, yet somehow we still manage to have incompatibilities even when there are only 3 implementations. Or how about POSIX? How much software needs no special casing to run on all POSIX systems?
This is the pattern: we abstract things farther with the (mistaken) belief that it will save us from having to deal with messy reality, but as soon as you try to do anything more than simple stuff you find that reality is, in fact, a mess and you have to work around your abstraction to deal with it.
Yes, agreed. It will be interesting to follow WASI for this kind of thing. I suspect it will try to rebuild POSIX like use case from the ground up. However, there are going to be so many use cases that fall outside the WASI boundary. Wasm is just getting started.
Don't underestimate programmers, and especially web developers, when it comes to making software dependent on IE or Chrome just because they can't be bothered to do any QA ;-)
Full disclosure: I am a developer, and I use and always used Firefox, for both pragmatic and ideological reasons.
Chrome implements the necessary features for wasm and WebGL. Here's a test: checkaux.github.io
Mobile Firefox lacks SIMD, and I haven't been able to get its shared memory to work on a local or deployed environment because of bugs in what it thinks is a 'secure context'.
Desktop and mobile Firefox do not support OffscreenCanvas without turning on flags. It's been years since this was supposed to ship. Therefore you cannot use separate threads to render to WebGL without using less performant workarounds.
Lack of proper threading means bad page performance.
There are lots more dumb bugs and features Firefox said it would support and simply never did. Understand there are real reasons why Chrome is preferred: it works, and it actually puts in the necessary features.
Then there is no point in having a browser platform, because we are back where we started from.
Then why don’t we simply run JVM apps, there are at least a handful of implementations, and it has at least a decent GUI (half joking here).
But really, the JVM is where WASM may be several years later.
VM's are not all the same, and a VM does not at all guarantee security.
The JVM was not designed to run arbitrary untrusted code, or at least is not as well designed for it as JavaScript or Wasm is.
If it were as simple as plugging the JVM into the browser, it would have been done. However, it's not, because the memory and execution safety measures are better in JavaScript and Wasm.
Wasm should actually have better isolation than JS.
If you want to know the details, go read the spec and look at the implementations.
Java Applets were a security risk, but it was decades ago. Modern JVM has many options to sandbox a running app.
But it seems that WASM tries to be a minimal CPU target instead, which is questionable - I can't wait to see memory errors (even if only inside a single webpage) on the web. With the good-quality GCs available with JS, but even more so on the JVM, I would say that running systems languages should not be the point.
The idea makes a lot of sense, but it all comes down to the system-level APIs that this application-runner would provide. Should it be Web-APIs? Then you're essentially rewriting the browser, maybe minus the DOM, CSS and JS. Should it be the Windows APIs? Then you're essentially rewriting Windows above the kernel. Just POSIX? Then what about applications that need rendering and sound. Do you invent your own cross-platform APIs? It's a massive undertaking, and without the momentum that the web had and has, it'll be tough to make this application runner popular among devs and users.
It seems like your argument is the opposite of the point made in the essay you linked. You say a browser is unnecessary, but the essay says that WASM won because it integrates with the rest of the browser.
Ha, you're actually kind of right. That article doesn't quite say what I remembered it saying. Been a while since I read it. I'd say about half the points are still applicable, but it is more pro-browser than me for sure.
Too late to edit, but Steve's previous blog post probably presents my point better, though with less focus on the vs Java argument:
A low-hanging use case would be scripts for a distributed data analysis system. E.g. you type in a function and then it compiles to wasm and is sent out to run on worker nodes. Or you write code snippets to run in a distributed DAG.
This exists with e.g. shared file systems (how HPC works, how Hadoop works), and Python can pickle code and send it over a socket (be aware that this is a security risk). WASM seems like a nice and direct solution for moving code to data.
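A sketch of the worker side with the wasmer 1.0 API (the exported `map` function and its i32 signature are illustrative assumptions):

    use wasmer::{imports, Instance, Module, Store, Value};

    // A worker receives raw wasm bytes over the wire and applies the
    // snippet's exported `map` function to its local shard of the data.
    fn run_snippet(wasm_bytes: &[u8], input: i32) -> anyhow::Result<i32> {
        let store = Store::default();
        let module = Module::new(&store, wasm_bytes)?;
        // No imports granted: the snippet can compute, but cannot touch
        // the worker's filesystem or network.
        let instance = Instance::new(&module, &imports! {})?;
        let map = instance.exports.get_function("map")?;
        let result = map.call(&[Value::I32(input)])?;
        Ok(result[0].unwrap_i32())
    }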
Running the code at the server means it is less likely to be seen by the user. I used to program in ASP 3.0 with VBScript. The user sees HTML code as the script is run on the server and HTML is rendered for the user. Also, the user is less likely to hack the code if it is on the server.
For everyone asking "but why use a browser technology server side? Just run a binary" or "what's old is new again, Java etc" - I am sure you've heard of a browser technology that is used server side: JavaScript. Ever hear of Node.js?
The rise of JavaScript and its server-side runtime is due to the immense pressure that language had to evolve to make the web what it is today.
WebAssembly is the next step in the evolution of this runtime. I can only imagine the sheer number of man-hours poured into JavaScript to offer the functionality it has today while balancing security and sandboxing. Node.js is a different story; take a look at Deno and its origin.
WebAssembly is a sandboxable, run-anywhere, ubiquitous runtime which isn't owned by any one corporate overlord, and which has (or is set to have) the same evolutionary pressure JavaScript had.
This is huge! Way to go Wasmer :)
You’re missing the point. It’s about containerization.
To quote a Docker cofounder:
> If WASM+WASI existed in 2008, we wouldn't have needed to created Docker. That's how important it is. Webassembly on the server is the future of computing. A standardized system interface was the missing link. Let's hope WASI is up to the task!
> offer the functionality it has today while balancing with security and sandboxing.
LoL, LoL, LoL
You can say this only because every other operating system sucks at sandboxing.
There is no reason you have to be as inefficient as a browser to provide security and sandboxing. The browser also brings a huge attack surface, with all the code running below it, and corner cases in web standards. AFAIK any JS click event can allow a website to read your clipboard. Websites can somehow prompt to add extensions to your browser, etc. I know there are legitimate reasons for this, but sandboxing is not an exclusive virtue of the web platform.
And I wouldn't be too confident using a browser either. Pretty sure at any given time there will be quite a few JavaScript/wasm attacks which can escape the browser sandbox and run arbitrary code, and then my files won't be safe. When I am running a potentially malicious website I create a separate user on my Linux box anyway.
BSD jails? I'd be quite confident running possibly dangerous code in one of the most restricted jail configs. Processes can't interact, can't access outside their given directories, can't modify devices, etc.
I like WASM for technical reasons, but I do not think it will stand the test of time and become a major part of how programs are run.
Mainly because the areas where it is used have other solutions that are "good enough". In the browser JS is good enough. On the server Docker + your typical Linux program is "good enough". In time things like KVM will offer great sandboxing to containers.
Another issue is that server side programs depend on system interfaces of Linux, which are battle tested, observable and trusted. With WASM you need to replicate that with some message passing system.
The article doesn't mention the number/class of CPUs in the system used for the benchmark. It claims that 'Singlepass' compilation of clang.wasm took 2s and 'Cranelift' took 9s.
Running the same benchmark myself in Linux on a 4-CPU VirtualBox VM on a MacBook Pro, I see 11s and 42s respectively. For reference, V8/Node.js compiles the same file in 27s.
Compiled files were 339 MiB (Singlepass), 640 MiB (Cranelift) and 90 MiB (V8/Node.js).
True, but for a cpu bound task like this VM typically causes only a 10% slowdown. You aren’t going to see an order of magnitude faster execution on real hardware.
The article makes it look like compilation is almost instantaneous. It’s not. Cached artifacts are rather hefty as well.
V8 fares surprisingly well given that it’s not a specialized WebAssembly runtime. Wasmer should probably watch competitors more closely.
>> We believe that WebAssembly will be a crucial component for the future of software execution and containerization (not only inside the browser but also outside).
Why? I don't get that, or maybe I'm missing some crucial parts, or maybe it's just a too-enthusiastic statement. I understand the security issues of running external code. Those security issues can be reduced with a sandboxed environment like a WA runtime.
On the server side the vast majority of the industry is running their own code. There are already many options like lambda-functions, hosted container environments, hosted VMs or of course running your own. So why take this extra step? Running plain binaries of your code is easier, faster and more reliable. Building, distributing and running a Go service is actually one of the easiest parts in the whole development and operations chain. Same is true for Rust or .NET Core. Even with Java where you need a runtime I don't see how a WA runtime solves a problem at all when running your own code.
I see great potential where you need to run untrusted code. Customer plugins for some edge server. But that is really a niche in terms of overall market.
I see the main advantage in this as running a polyglot application as if it's native. For example, if you are making a Rust application, you can have a plugin architecture where the plugins run inside a wasm-compiled Python interpreter.
I don't know if doing this directly and cross-platform is as easy without wasmer.
No need to recompile means you can run the same, guaranteed known, signed, etc code everywhere.
Purpose-built sandboxing means you can run untrusted code more safely — e.g. at the network edge, like a Cloudflare worker, a lambda, etc. Same for third-party plugins in apps, etc.
An open standard and open implementations gives you some assurance against lock-in, licensing changes, patent suits, etc.
The JVM is rather heavy weight and typically slow to start even when you do have enough resources to run it. That rules out edge computing. Graal AOT compilation is more similar but not a general purpose thing currently as it only works with tools and languages optimized for that.
I say this as a Kotlin/Java developer. I know and love the jvm but it's just not a lightweight option suitable for edge computing or serverless computing (it works but the jvm startup overhead + binary size is annoying).
Java has had AOT compilation support since around 2000, when commercial JDK vendors started supporting it; what it lacked was free-beer AOT compilation, as GCJ never really did it without issues and was quickly abandoned when OpenJDK came onto the scene.
Isn't the JVM too high level for compiled languages without garbage collection? E.g. can I efficiently compile C to the JVM without asm.js-style hacks which degrade performance?
I like WebAssembly for the same reason I like Rust, the actor model and compile-time-checked state machines. It makes it more plausible that we can build robust services in the future. And it even feels like Computer Engineering when I'm working with these concepts. To me the WebAssembly runtime is the next iteration of isolated applications, after containers, which came after VMs, and I'm sure HN could come up with a bigger picture, but this is OK for my purpose.
And I know nothing about browsers, GUI and front-end stuff, but WebAssembly is nevertheless very interesting to me, and I consider it the biggest thing since Docker.
Are you talking about the browser or other environments?
There is a proposal [1] for adding sockets to WASI.
This blog post [2] has some more background info.
Outside the browsers, you can always provide extra functionality yourself by supplying host-defined functions as imports. That loses the benefit of a standard API, but is fine for custom deployments.
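For example, with wasmer's Rust embedding (1.0-era API), a host function is handed to the module as an import; the `env`/`log_i32` names are just a convention the guest module would have to match:

    use wasmer::{imports, Function, Instance, Module, Store};

    fn instantiate(wasm_bytes: &[u8]) -> anyhow::Result<Instance> {
        let store = Store::default();
        let module = Module::new(&store, wasm_bytes)?;

        // A host-defined function the guest can call; this is how custom
        // deployments expose capabilities that WASI doesn't standardize.
        let log_i32 = Function::new_native(&store, |x: i32| {
            println!("guest says: {}", x);
        });

        let import_object = imports! {
            "env" => { "log_i32" => log_i32 },
        };
        Ok(Instance::new(&module, &import_object)?)
    }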
Yeh, this article kinda confirms what I've been thinking the past few years:
"Implementing a server requires additional API functions exposed - specifically, sock_bind, sock_listen, and sock_accept"
"And while implementing them is done similarly to what was described here so far, there is an underlying issue: there is no multi-threading support in WebAssembly"
Really surprised to see so many people talking this up for running many/most server-side applications with so many fundamental features missing.
There is of course the threads proposal [1], which works in chrome with a feature flag, but most runtimes don't support it yet.
In general I have to agree that WASM needs more time. There are multiple quite crucial proposals in the pipeline (reference types, interface types, threads, tail calls and other advanced control flow, module linking, (GC), ...) which have all seen very slow progress.
Overall I'm still bullish on the potential of WASM for universal deployment, but the slow pace is a bit concerning.
Because the "MVP" stage is itself production ready, which is why it was stabilized in browsers 3 years ago. Also, some non-MVP features are almost stable, e.g. atomics are now enabled on chrome by default IIRC.
WASI, the posix-like system interface ABI, is still technically a "snapshot", which has made me slightly concerned wrt stability given that Rust can compile to WASI even on the stable channel, but hopefully even the full stable version is backwards-compatible with snapshot-1.
To be more specific, there are multiple companies using this for edge computing. E.g. cloudflare has supported wasm for a while now. At this point it is increasingly widely used in the browser as a way to optimize parts of e.g. react and other popular frameworks. So, the compiler works and is being used by people. So, kind of appropriate to release a 1.0 to signal having some kind of stability.
'Production ready' doesn't mean it has all the features you would ever want. It means it is stable and tested enough to run critical workloads.
They are saying the core is production ready (if your needs are met by the features included in the MVP), but that planned additional features are not yet production ready.
> It is production ready in all (vital) aspects concerning execution in a browser
As in (emphasis mine): "This means that there are important features we know we want and need, but are post-MVP" [1]
As in: "a Minimum Viable Product (MVP) for the standard with roughly the same functionality as asm.js, primarily aimed at C/C++;"
It's good that you specified "vital" when talking about "all" aspects. Where vital is basically "let's run MVP in an isolated sandbox with little-to-no interoperability with the rest of the browser". Because the rest of the vital things like "basically everything" [3] [4] are still MIA.
But yeah, I mean, the modern web was built on a language designed in 10 days, so I shouldn't complain, should I.
[0]: https://steveklabnik.com/writing/is-webassembly-the-return-o...