WebTransport API (wicg.github.io)
113 points by Jarred on June 28, 2020 | 66 comments



A lot of people here seem confused why this is needed or wanted with WebRTC. I sat in the BOF session at IETF 106 in Singapore last year when we talked about this. Happy to answer any questions as I understand them. (Though I'm no expert.)

I can't find the video link, but the slides from that session are here: https://datatracker.ietf.org/meeting/106/materials/slides-10...

WebTransport is designed as a 'QUIC-native' client-server communication protocol to replace websockets. You can think of it like websockets but better (faster handshake, it reuses the HTTP/3 / QUIC connection, and you can relax websockets' reliability & ordering constraints). Or you can think of it as WebRTC, except way simpler for the 95% case where you want to communicate server/client.

I'm excited about it for browser based video games.

It looks complicated, but there's very little that's new here for browsers and web servers to implement. Remember QUIC is already built on top of UDP, and internally, QUIC supports basically all of WebTransport's features natively. WebTransport mostly just re-exposes many of QUIC's features to applications. Browsers and HTTP servers are adding all that functionality anyway - so we may as well take advantage of it in our web applications.
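
For a sense of the shape of the API, here's a rough sketch of sending unreliable datagrams from the browser. The constructor and member names here are taken from the draft and may well change before this ships, so treat it as illustrative only:

     // Sketch based on the draft API - names/shape may change as the spec evolves.
     const transport = new WebTransport("https://game.example.com:4433/session");
     await transport.ready;                                // QUIC connection established
     const writer = transport.datagrams.writable.getWriter();
     writer.write(new Uint8Array([1, 2, 3]));              // unreliable, unordered datagram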

My favorite criticism voiced at the IETF was the name - WebTransport has nothing to do with the web, and it's not a transport. It's mostly just an application API around some of QUIC's features, with fallbacks for HTTP/1.1 and HTTP/2.


Websockets but with UDP, and without all the offer/answer, TURN/STUN complications that WebRTC brings? Lovely! Thanks for explaining!


Isn't HTTP/2 basically a replacement for websockets already? It's bidirectional. Has streams. Just missing a way to create a stream in the browser, iirc.


You are right. If browsers exposed APIs that allowed access to request and response bodies in a streaming fashion, then HTTP/2 (and actually even HTTP/1.1 with chunked encoding!) could be an alternative to websockets.

The remaining difference would be websockets having built-in frame delimiters and a meaning of text and binary frames - whereas with HTTP streams you would need to do that at the application level.

Getting Streaming APIs for HTTP was planned to be part of the ReadableStream and WritableStream extensions for fetch - but I have no idea how that effort went since I last looked at it (5 years ago).


You can read the response body as a stream, so that part works. You can't (last I checked) write to your request body. That's the only missing piece.
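
For reference, reading a response body incrementally already works with fetch today. A minimal sketch (the endpoint and the chunk handler here are placeholders):

     const response = await fetch("/events");        // any endpoint that streams its body
     const reader = response.body.getReader();
     for (;;) {
       const { value, done } = await reader.read();  // value is a Uint8Array chunk
       if (done) break;
       handleChunk(value);                           // hypothetical handler
     }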


Optional reliability and ordering are very important for real-time multiplayer games. WebSockets don't support this as they use TCP.


Theoretically maybe, practically definitely not.

HTTP is still very much geared towards request/response semantics, just sometimes serving requests you didn't yet know you want to make.

Also, WebTransport seems to be more of an extension of the idea of WebSockets to UDP and other reliability/ordering semantics than an outright replacement.


There’s some discussion around doing HTTP/2 streams (reliable, stream-based) with the fetch API (I am not certain which browsers allow for this), but note that these are only one use of WebTransport - the main thing I’m excited for is doing unreliable datagrams without bringing in an entire WebRTC stack.


Not sure if you missed it or not, but QUIC is the transport underneath HTTP/3. It has known advantages that are highly valued in domains like games, as josephg has mentioned.


Yeah, but what does that have to do with what I wrote? My question is, if HTTP can now do streaming and all of these other useful things that you previously got with WebSockets, why do we need a WebTransport API? Why don't we just use HTTP instead?


>Browsers and HTTP servers are adding all that functionality anyway

I am going to assume things will improve once HTTP/3 is finalised. But right now I don't see QUIC from non-Google servers at all.

And I can't help but think (maybe it is just me): are we trying to over-engineer the Internet?

Edit: I just checked on the client side. Turns out Safari 14 may be the first browser to ship with HTTP/3 enabled by default.


> But right now I dont see QUIC from non-google server at all.

You won't see WebTransport appearing for a while either. It's all still in draft status. The IETF doesn't rush these things.

And for what it's worth, the spec for WebTransport is being worked on in close collaboration with the QUIC (HTTP/3) working group at the IETF. Everyone wants WebTransport and QUIC to be a good fit for one another, and straightforward to implement in browsers once it's ready.

I worry about over-engineering too; but the best way to express those fears is to get involved. Standards are written by those who show up. And the IETF welcomes anyone who's interested to join the mailing lists and attend IETF sessions. (So long as you accept the community norms.) Even the in-person events (when they come back) bend over backwards to welcome remote attendance and remote questions.

This is the video of the webtransport BoF session from Singapore last year, if you're curious how this sort of thing plays out in person. There were only about 25 people there:

https://play.conf.meetecho.com/Playout/?session=IETF106-WEBT...

Or the direct youtube link (without the text from remote participants): https://youtu.be/o5cJEuO2-vk


Safari will likely be the last browser to support HTTP/3, as it will only be available in the unreleased macOS 10.16. HTTP/3 is already in the other browsers, with instructions on how to enable it [0]. Firefox Nightly and Chrome Dev already support Draft 29 (which is in Working Group Last Call [1]).

0. https://www.bram.us/2020/04/08/how-to-enable-http3-in-chrome...

1. https://mailarchive.ietf.org/arch/msg/quic/F7wvKGnA1FJasmaE3...


> Webtransport is designed as a 'QUIC-native' client-server communication protocol to replace websockets.

Is there anything wrong with websockets? I thought websockets were cool.


Websockets work on HTTP/1.1 (not later versions) and turn the TCP connection into a packet-based transport, unrelated to HTTP.

WebTransport works on HTTP/3 and exposes a part of the underlying UDP QUIC connection (while still allowing HTTP/3 messages to travel across the same underlying connection), allowing for stream-based or packet-based subconnections, reliable or unreliable. Basically, it’s a lot more flexible and doesn’t require setting up an entire new connection to work.
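
A rough sketch of the stream-based side under the draft API (again, the exact names come from the draft and may change):

     // Draft-API sketch: a reliable, ordered substream alongside unreliable datagrams.
     const transport = new WebTransport("https://example.com:4433/wt");
     await transport.ready;
     const stream = await transport.createBidirectionalStream();
     const writer = stream.writable.getWriter();
     await writer.write(new TextEncoder().encode("hello"));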


> Is there anything wrong with websockets? I thought websockets were cool.

I'm excited for the option of having an unreliable connection. Currently you'd need to establish a webrtc data connection to the server, which is a bit cumbersome when you really don't need or want the rest of webrtc.


WebSockets are not cool if you ask anybody who's worked with them in gaming. :)


This is great. One of the biggest hurdles for WebRTC is the sheer number of technologies used to create it.

I might have just found a weekend project.


Can I use WebTransport for multi-tenant non-p2p video conferencing?


tl;dr seems to be "probably" - my setup for this would be a MediaRecorder on your webcam and microphone input, piping data into a modified version of SRT for some level of reliability, error correction and buffering so the data arrives at the decoder at the right time. The problem is that MediaRecorder is not particularly configurable, and seems to spit out something different from the WebRTC encoder.
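
For context, the capture side of that pipeline would look roughly like this; the mime type and the chunk uploader below are placeholders:

     const media = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
     const recorder = new MediaRecorder(media, { mimeType: "video/webm;codecs=vp8,opus" });
     recorder.ondataavailable = (e) => sendChunk(e.data); // sendChunk is a hypothetical uploader
     recorder.start(100);                                 // emit a chunk roughly every 100 ms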


Folks who are interested in the WebTransport API may find it easier to read the "explainer" document. https://github.com/wicg/web-transport/blob/master/explainer....

It addresses a number of the top comments here, e.g. why not just use WebRTC data channels:

"While WebRTC data channel has been used for client/server communications (e.g. for cloud gaming applications), this requires that the server endpoint implement several protocols uncommonly found on servers (ICE, DTLS, and SCTP) and that the application use a complex API (RTCPeerConnection) designed for a very different use case."


The WebTransport API looks to bring WebSockets up to speed with similar features provided by WebRTC: reliable or unreliable connections and multiplexing. The addition of a Stream interface means it will be efficient for uploads and downloads too.

I can see this replacing most use cases of WebSockets once brought to ubiquity.


Do you know the answer to "Why not just use WebRTC?"? I am used to documents like this having an explanation of the "why", but I am not seeing it here (or I'm extra stupid today).


I use WebRTC and WebSockets in a side project of mine [0]. WebRTC requires coordination of multiple services and fallbacks (relay servers) when peers can't establish a direct connection due to firewalls among other things.

STUN, TURN, and ICE are a few technologies you need to understand to get started with WebRTC. I'd guess most folks aren't familiar with them when first looking into it.

The complexity is worth it if you're determined to push most bandwidth costs to clients like I am.

If you want near 100% connectivity at a lower complexity, WebSockets are almost always preferred.

[0] https://github.com/samuelmaddock/metastream


Additionally, during the 3 years I've had experience with WebRTC, I've come across at least 2-3 browser bugs causing compatibility issues when connecting peers cross-browser.

This is just not something you run into with WebSockets.

I'd recommend simple-peer for anyone who does choose to use WebRTC. The maintainers are usually able to smooth these issues out in a reasonable timeframe. <3

https://github.com/feross/simple-peer


I don't think the issue with WebRTC is the protocol, but the tooling. The community did a really good job of creating tooling for WebSockets with stuff like https://github.com/crossbario/autobahn-testsuite. There is nothing like that for WebRTC.

We tried to do it, but the IETF event was cancelled https://twitter.com/steely_glint/status/1230447935307026432. I am hoping that in the next 6 months we can have something like that so that all the WebRTC implementations will work together a lot better.


I guess my question wasn't clear; why would they create a new, seemingly unrelated spec to try to bring WebSockets up to where WebRTC is, rather than just add the little tiny signalling bit to WebRTC? (Particularly since people are also simultaneously trying to bring QUIC transports to WebRTC data channels?) We shouldn't be maintaining so many separate protocol stacks and APIs in the browser. (Context: I use WebRTC in my day job, not as a side project, and have been working with it for years now ;P.)


The effort to bring QUIC to WebRTC is kind of the same effort. WebTransport is an outgrowth of that effort. They are not unrelated.

As for "too many protocols": QUIC is already in the browser. This is just giving you an API to it. Think of it like WebSockets for the HTTP/3 world.


WebRTC is notoriously complicated to set up and quite unstable. Many companies have made a business of configuring WebRTC.


You can create a WebRTC datachannel in any modern browser with just one line of JavaScript:

     channel = new RTCPeerConnection().createDataChannel("foo", {ordered: false})
Then you can read from it with:

     channel.onopen = () => { ... }
     channel.onmessage = (event) => { ... }
This is widely supported in every modern browser.
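
(As the replies below point out, that channel won't actually connect to anything until you also do an offer/answer exchange with the other side. A minimal sketch of the extra client-side signaling that's still required, where sendOfferToServer is a hypothetical helper that returns the remote SDP answer:)

     const pc = new RTCPeerConnection();
     const channel = pc.createDataChannel("foo", {ordered: false});
     await pc.setLocalDescription(await pc.createOffer());
     // Ship the offer to the server (e.g. over fetch or a WebSocket) and get its answer back.
     const answer = await sendOfferToServer(pc.localDescription);
     await pc.setRemoteDescription(answer);
     channel.onopen = () => { /* ready to send */ };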


What about the STUN, ICE etc stuff? Saying “you just open a connection” is only part of the story.


If you are talking to a websocket-like server endpoint, hopefully it won't be behind NAT?


Actually, I'm currently working on an app using WebRTC, and from everything I've read you actually still need a STUN and in some cases a TURN server, even if you have a public IP:

https://bloggeek.me/turn-public-ip-address/

To be perfectly honest, I'm still fuzzy on the why, but numerous blog posts, stack overflow questions, and the Kurento forums agree. This being Hacker News, maybe someone will chime in with the technical reasoning.


WebRTC negotiates a peer-to-peer protocol which operates like client/server but has issues with NAT/PAT and firewall traversal. Each peer will realistically only know about its local network and there will also likely be limitations on the peers even being able to communicate with one another directly.

STUN and TURN allow the negotiation to include services outside the local network, so that you can relay signaling and stream data to a set of endpoints that both peers can use to communicate with one another. This could be the peers directly, or it could be a set of 3rd party servers run by Google or someone else open to the public; you can't assume any particular network topology.
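
(For reference, this is where STUN/TURN enter the client API: you hand ICE server URLs and credentials to the peer connection, and ICE works out which candidate pair is usable. The server addresses and credentials below are placeholders:)

     const pc = new RTCPeerConnection({
       iceServers: [
         { urls: "stun:stun.example.com:3478" },   // address discovery
         { urls: "turn:turn.example.com:3478",     // relay fallback
           username: "user", credential: "pass" }
       ]
     });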


This article is essentially claiming that for WebRTC to use TCP you have to be using TURN. While I am willing to believe I am wrong here (our product feels a bit crippled over TCP to the point where I just recommend you not use it, so if it isn't working I might not even notice), I am nearly certain this is not true.


You are right, ICE does support TCP. Not all ICE implementations support it though, which is probably where the incorrect anecdote comes from.

ICE TCP did get a little more limited recently in Chromium though [0] because of the TCP port scanning issue [1]

I have also heard that some gateways/firewalls/$x don't allow any non-TLS traffic, so you can't even establish ICE. In those cases the DTLS/TLS transport of TURN is nice.

[0] https://lists.w3.org/Archives/Public/public-webrtc/2020Feb/0...

[1] https://medium.com/tenable-techblog/using-webrtc-ice-servers...


You are probably right. But it sounds like this would still be better solved by a light tweak to WebRTC than by a new protocol.


Exactly. If WebSockets were an option, then WebRTC would be pretty trivial and absolutely wouldn't require any STUN/TURN.

(The main issue, and this has nothing to do with the client API, is that WebRTC implementations tend to end up assuming unique ports for each user--which would be needed to help with NAT--but if you aren't behind NAT then the ICE layer already has a connection ID so you should be able to multiplex them all over a single open port.)


One big issue was also being able to demux DTLS traffic. You could do it off the 3-tuple of the remote host, but that would fall down sometimes.

I am really excited for DTLS connection IDs [0] to land. Then you will have everything you need to run ICE+DTLS (and SCTP over that) and be able to demux/load-balance it more easily.

[0] https://datatracker.ietf.org/doc/draft-ietf-tls-dtls-connect...


This is simply not true. Unless things have changed widely since I used WebRTC, this connects to nothing.

Leaving out the ICE will of course make WebRTC look palatable.


The pain point is running ICE, DTLS, and SCTP on the server.


That sounds like you need better software then :) Realistically what do you think the timelines are for QUIC?

When Bernard published the QuicTransport stuff I tried a few different versions and it only worked with aioquic [0] (which is a really fantastic implementation)! But with 29 drafts and most servers not supporting them all, it feels like we still have some time to go.

So QUIC as a server is much less likely to happen than ICE/DTLS/SCTP, which have implementations that work everywhere.

[0] https://github.com/aiortc/aioquic


> That sounds like you need better software then :)

No, it rather sounds like you have a hammer called WebRTC and you're attempting to use it on a nail called "bidirectional server/client communication".

Why deal with the complex protocol stack of WebRTC, which solves, among other things, NAT traversal, mutual authentication and encryption independently of server certificates, multiplexing of data and A/V content on a single port, and much more? And I say that as someone who absolutely loves WebRTC for A/V and secure P2P use cases.

There is a true gap of "UDP for the web", which this fills.

"WebSockets over UDP" don't need to be built on QUIC, but I'm assuming by the time you have added all the security features needed to make this as secure as WebSockets for web apps, you'll effectively end up with something equivalent.


I have a hard time understanding what metastream is... I saw a demo of being able to play a YT video and chat at the same time, invite ppl, etc... Does it support sharing any video stream playing on a browser? (say, netflix). How does it work?

EDIT: bizarrely, I found this article on The Verge more descriptive of what metastream does than the website, WIKI and source code :-) [1].

1: https://www.theverge.com/2020/3/25/21191604/watch-movies-fri...


WebRTC is pretty hard to get right for client/server (it is designed for P2P), and may exhibit P2P downsides even when used for client/server (obnoxious NATs). If "dumb" client/server UDP doesn't work, your internet likely has bigger problems. Then there are the RFCs you have to implement to build it, which are more niche than you'd think. You can't do the entirety of WebRTC in Rust right now without implementing multiple massive RFCs yourself.

It's a simpler spec to solve a simpler problem. WebRTC is necessarily complex, but still overkill for a bunch of scenarios.


I did a presentation last year that sort of answers that from a gaming perspective (gaming was the topic of the workshop):

https://vimeo.com/350908362


How is this expected to work for apps that have a webserver fronting the app that terminates TLS/QUIC/HTTP3 and proxies over fastcgi/http1.1/h2c, for example?

Specifically, how would this work for say, a PHP app that wants to stream stuff? I use https://reactphp.org/ for websockets in PHP (so I can have a shared codebase with my main app, etc).

I don't think I understand how this would look on the server side. I'm interested to understand how it would look to support this in Caddy (I'm one of the core contributors).


If you need real-time two-way communication in the browser, you could look at Comet-Stream (since IE7 is now in all practical senses gone). It's simpler, consistent, and scales like a monster on the server:

https://github.com/tinspin/rupy/wiki/Comet-Stream

The only thing that is annoying in the browser is that Chrome does not allow you to remove or change the User-Agent header, which wastes a little bandwidth.
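
(For anyone unfamiliar with the pattern: a comet stream is essentially a long-lived chunked HTTP response that the client reads incrementally. A rough client-side sketch using the classic XHR onprogress technique, assuming a hypothetical /push endpoint that keeps the response open and appends newline-delimited messages:)

     const xhr = new XMLHttpRequest();
     let seen = 0;
     xhr.open("GET", "/push");                      // hypothetical long-lived endpoint
     xhr.onprogress = () => {
       const chunk = xhr.responseText.slice(seen);  // only the newly arrived bytes
       seen = xhr.responseText.length;
       chunk.split("\n").filter(Boolean).forEach(handleMessage); // handleMessage is a placeholder
     };
     xhr.send();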


What is the benefit of this over WebSockets? More generally, what is it even, and what problems does it solve?


It uses HTTP/1.1, so it goes through all firewalls, and the server implementation is simple enough for it to be rock solid. It scales with "joint parallelism", which only Java can do because of the complex memory model and VM.

Rupy is a complete beast in terms of multi-core power: as long as the selector thread is not saturated, it can scale linearly across all cores on shared memory = no memory copying or locks like all other solutions!

But the real upside is you get one process for async HTTP (client and server), including the database; which means you can use microservices the way they should be used, by hosting all services on all machines and calling them locally = more robust and less complex (no discovery after the initial client -> server DNS), completely without IO-wait!

WebSockets are completely over-engineered, as are HTTP/2 and 3...

So to answer your question: rupy is the final solution to internet servers. ;)

Now I'm working on the final solution for internet clients: http://talk.binarytask.com/task?id=5959519327505901449

And internet 2.0: http://radiomesh.org

And the final computer: http://talk.binarytask.com/task?id=8015986770003767235


What do you mean by joint parallelism?


Most parallel systems are embarrassingly parallel = they are trivial to distribute in the first place = you could run them on separate machines.

Joint parallel is the opposite of that, where you need very fast shared memory between the cores. Meaning all cores touch all data, it's a rarer form of parallelism because it is hard.

But if you want to write a MMO server f.ex. you need to understand how this works all the way down to the hardware.


Ok. I think fine-grained parallelism is the established term for this.


Partly, but "fine-grained parallelism" refers to how small you can make the sub-parts of the parallel execution, and then adds "Another definition of granularity takes into account the communication overhead between multiple processors or processing elements.", which is confusing! t_compute / t_communicate (where communicate probably does not differentiate between a memory copy and waiting for memory due to a lock or cache miss)?

According to that definition, if you have a really tiny task that has fast memory (embarrassingly parallel + little memory) and a huge task that has really slow memory, they are the same (1/1 == 100000/100000), so really, what does that definition say?

Also it seems only instruction-level parallelism counts as fine-grained. This means everything I will ever do and talk about is coarse-grained. So how can I tell people my software is 10x faster than Erlang for an MMO type of server problem space?

I'm going to access contiguous memory with 2 separate threads simultaneously without locks in C when I finalize my 3D MMO engine client this fall, and then I'll understand more about how these things really work; right now I'm a little confused.

I think "joint parallelism" is more telling; it puts focus on the bottleneck of our civilization, which is / and will always be: memory speed!

F.ex. Erlang is completely meaningless for "joint parallelism", because it uses memory copying instead of monitors/locks and even there Java can be even more performant according to the creator of the Java concurrency package:

"While I'm on the topic of concurrency I should mention my far too brief chat with Doug Lea. He commented that multi-threaded Java these days far outperforms C, due to the memory management and a garbage collector. If I recall correctly he said "only 12 times faster than C means you haven't started optimizing"." - Martin Fowler (https://martinfowler.com/bliki/OOPSLA2005.html)

I have mailed Doug to get an explanation, but the only thing I can tell you is that my implementation of what he talks about in that quote is proof that he is right. How he is right, I still don't totally understand! He hasn't and probably won't reply though!

More worrying is the memory copying that the kernel is doing, I think the last step for computing advances is to go back and simplify the OS.

My prediction is that we will get kernel bypass for network IO pretty soon and disk IO will follow after that, at least for servers.


Finally a specification for datagrams, unordered and unreliable! This was one of the big struggles of bringing MMOs to the browser. DatagramTransport seems really promising as it provides encryption and congestion control, which are the most complicated parts of any netcode. Good luck!


This was one of the big struggles to bring anything remotely time critical to the browser.

WebSockets only work for the mere basics.


SCTP already provides that in WebRTC.


But it requires a lot of mucking around with STUN/TURN/ICE, which is far less desirable than just "send UDP packets to a browser".


You don't need to use STUN and TURN for client->server use cases, but you do need ICE, DTLS, and SCTP.


I think this is harmful for the internet because we need WebRTC to become more ubiquitous, to keep it from being blocked at firewalls.

Also, WebRTC functionality is a superset of this, so it's not a technically necessary protocol.


If anything, an increase in the use of QUIC will reduce the amount of UDP blocking, which would be good for WebRTC.


I think UDP is often whitelisted by port in these cases and rarely blanket-blocked, as DNS, VPN protocols, etc. are still needed. So QUIC, WebRTC, and WebTransport all have their own battles.


To my understanding, the ability to perform browser-to-browser communication with the WebTransport API is not touched on by the document. Is my reading right?


That's in a separate doc. Search for "RTCQuicTransport".


Really glad to see people excited about this!

If anyone is wondering if this is already implemented anywhere, we're currently experimenting with it in Chrome: https://web.dev/quictransport/ -- I'd be curious to hear what people think about it.


Could PJON https://github.com/gioblu/PJON be used as one of the underlying pluggable protocols? It would be cool to be able to use the browser along with an open-hardware physical network infrastructure.



