WebSockets vs. Server-Sent-Events vs. Long-Polling vs. WebRTC vs. WebTransport (rxdb.info)
518 points by bubblehack3r 10 months ago | 228 comments



I've always had a bit of a soft spot for Server Sent Events. Just simple and easy to use/implement.


With ipv6 they can now be fully scaled easily but they are absolutely awesome, much easier to scale because you can give your client a simple list of sse services and its essentially stateless if done right.

Websockets get really complex to scale past a certain level of use.


> With ipv6 they can now be fully scaled easily

Any day now: https://www.google.com/intl/en/ipv6/statistics.html


What does ipv6 give you that virtual hosts don't?


In theory direct connections between all devices on the internet. In practice, everything's still going to be behind a firewall. But it's still an improvement over NAT, and hopefully we'll eventually get universally adopted protocols for applications to open ports.

I've been using IPv6 more recently, and one nice thing as a developer is being able to use the same IP address for local connections and internet connections. Simplifies managing TLS certs for example, since the IP address used by Let's Encrypt is the same one I'm connecting to while developing.


You misunderstand. I'm asking why IPv6 would help this specific situation, not why ipv6 is nice in general. None of what you said applies to this context.


You're right, I was responding to your comment directly and not taking context into account.

I guess maybe what GP is getting at is that with vhosts on IPv4 you need to have some sort of load balancer in order to share the IP, but with IPv6 you can flatten this out and give every host its own IP?


If you have multiple machines with their own IPs, v4 or v6 makes no difference. If you have a single machine (vhosting), the number of IPs and their type makes no difference.


The difference is IPv4 addresses are far more expensive


I will quote the comment you are trying to justify:

> With ipv6 they can now be fully scaled easily but they are absolutely awesome, much easier to scale because you can give your client a simple list of sse services and its essentially stateless if done right.

If you don't understand it either please stop saying random stuff about IPv6 that we already know and has nothing to do with this thread.


I stand by my comment. I think the suggestion that scaling might be limited by IPv4 cost is a reasonable guess as to OP's concerns.


How is it limited? How is it less limited with IPv6?

A client gets an SSE endpoint (hostname). That endpoint maps to an IP. A server at that IP receives the connection. Which part is better with v6?

Are we talking about the few cents it would cost you to give each server an IPv4? Are we thinking about a distant future where that cost is not negligible compared to the cost of compute? Something else?


AWS is now charging about $3/mo per IPv4 address. If that's negligible for you, awesome. It's not negligible for everyone.


So that was the only argument. Wow, I'm glad I finally got to it, but I'm not amazed. All it took is making it up myself.

I don't use AWS so ok thanks. And if I did, I would use their ingress/gateway solutions, which totally circumvent this problem (while being quite expensive anyway).


I agree, AWS is way too expensive. In fact you have a good point which is that the cost of compute on AWS is significantly higher, such that the relative cost of IPs isn't as significant. You can get a solid VPS from Hetzner for the cost of an AWS IP address.

What's your preferred provider?


I don't feel it would be right to just give you an answer after chasing yours 11 messages deep and finally having to make it up myself. I don't have the kind of emotional energy it takes to communicate with you.


"why dropbox when rsync"


I know what you're referencing but I really don't see why. What does IPv6 have to do with this?


Similitudes in timbre. There is plenty of room for improving the state of the web. Not 100% certain IPv6 is that, but it certainly offers more address space, and it would be foolish to avoid embracing that simply because old tech is duct-tape-able enough.


I am not trying to say IPv6 is bad (in fact I'm a fan), I am asking what benefits it offers specifically in the context of SSE.

I am not the one making a claim, I expected some arguments with it. "IPv4 is not web scale" is not good enough https://youtu.be/b2F-DItXtZs


is anybody actually able to disable ipv4? maybe if you only serve vpn or internal users?

This might be the best thing about Elixir/Phoenix LiveView. I haven't actually had to care in quite some time :-) (though to be fair, I keep things over the websocket pretty light)


Yeah there's 6to4 schemes. It's common in the U.S. for cell providers to give IPv6 but private IPv4 and do NAT (although IPv4 could be skipped altogether)

AWS you can use NAT Gateways for 6to4 and do v6 only subnets


Also because they’re so simple you can use a CDN to scale them way more easily than you can WebSockets: https://www.fastly.com/blog/server-sent-events-fastly


> SSE connections keep a mobile device’s radio powered up all the time. You should avoid connecting a SSE stream on a device that has a low battery, or possibly avoid using SSE at all unless the device is plugged in.

Damn, that’s a huge downside


Same applies to websockets, but yeah.


But isn't your device’s radio powered up all the time anyway?


I think it’s on all the time but the phone has different power levels that it gives the radio, all this to optimize the battery usage.

So the more the phone needs to utilize the radio, the higher the power level is?

That’s just my theory though.


I don't think keeping sockets open waiting for incoming data has a big impact on battery usage, because there is no data transmission at that moment, so the radio shouldn't consume much energy in standby mode.

I use K9-Mail app for email working 24h a day, it has multiple accounts on different IMAP4 servers. You know, IMAP requires one keep-alive socket per subscribed folder and I have no problem with battery usage.


I don’t think IMAP is a good example here. Your email client will try both subscribing and classic polling. Subscribing is not a MUST. And from the point of view of end user, the difference when polling in a classical way is simply that the user will be notified later, so it’s hard to tell what the client really did in the background.


> way is simply that the user will be notified later, so it’s hard to tell what the client really did in the background.

I receive emails instantly. There is polling option in settings, I've disabled it.


I agree. Unfortunately you can only have 6 SSE streams per origin per browser instance, so you may be limited to 6 tabs without adding extra complexity on the client side.

https://crbug.com/275955


Is above still an issue with http2/3?

edit: From the article: To workaround the limitation you have to use HTTP/2 or HTTP/3 with which the browser will only open a single connection per domain and then use multiplexing to run all data through a single connection.


No, if you can enable TLS and HTTP/2|3, you are only technically using a single browser connection, onto which multiple logical connections can be multiplexed.

I think the article calls this out. There is still a limit on the number of logical connections, but it's an order of magnitude larger.


just use a service worker to share state, you would be much better off doing this anyways. saves a ton and is performant.


I think you need a SharedWorker for that rather than a service worker https://developer.mozilla.org/en-US/docs/Web/API/SharedWorke...


Shared workers (inexplicably) don't exist on Android Chrome

https://issues.chromium.org/issues/40290702


A service worker would work fine; the connection would be instantiated from the SW and each window/worker could communicate with it via navigator.serviceWorker.


That doesn't work because browsers have duration limits on ServiceWorkers:

https://github.com/w3c/ServiceWorker/issues/980#issuecomment...

Also unfortunately Chrome doesn't keep SharedWorker alive after a navigation (Firefox and Safari do):

https://issues.chromium.org/issues/40284712

Hopefully Chrome will fix this eventually, it really makes it hard to build performant MPAs.


In my experience, as long as a controlled window is communicating with the SW, the connection will remain alive.


HTTP 2/3 doesn't have this limitation.

For HTTP 1, simply shard the domain.


You can get around that limit using domain sharding, although it feels a bit hacky.


Just have one tab use SSE and have the others listen for the storage event.


Can the tabs share a background worker that would handle that?


You can use https://www.npmjs.com/package/broadcast-channel which creates a tab leader, no need for a background worker

Edit: of course you could use: https://caniuse.com/sharedworkers but android does not support it. We migrated to the lib because safari took its time… so mobile was/is not a thing for us


Here's the chrome android issue for Shared Workers. Add your voice if it is something you need

https://issues.chromium.org/issues/40290702


Is that true if you are using HTTP/2?


Yes, but the limit is different (usually much higher) and negotiated, up to maximum SETTINGS_MAX_CONCURRENT_STREAMS (which is fixed at 100 in Chrome, and apparently less in iOS/Safari.)


Nope. That's only a problem with HTTP/1.1


The downside is that you have to base64 payloads or otherwise remove newlines.

I wonder why they didn't just use a multipart streamed response.

Supports my metadata, very commonly implemented format


No need to base64 everything if you can just escape the new lines. Or you can use https://github.com/luciopaiva/binary-sse
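
For text payloads, the escaping approach could look roughly like the sketch below. The function names and the choice of backslash escapes are assumptions for illustration, not something SSE prescribes; any reversible scheme that keeps raw line breaks out of the payload would do.

```typescript
// Escape backslashes first so the encoding stays reversible, then line breaks.
function escapeForSse(payload: string): string {
  return payload
    .replace(/\\/g, "\\\\")
    .replace(/\r/g, "\\r")
    .replace(/\n/g, "\\n");
}

// Undo the escaping in a single pass, so an escaped backslash followed by
// the letter "n" is not misread as a line break.
function unescapeFromSse(escaped: string): string {
  return escaped.replace(/\\(\\|n|r)/g, (_match: string, ch: string) =>
    ch === "n" ? "\n" : ch === "r" ? "\r" : "\\",
  );
}
```

The escaped string contains no raw line breaks, so it fits on a single "data:" line.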


Works with bog-standard Apache prefork and PHP.


don't forget the timeout reconnect!


Absolutely underrated.


SSE are really a subset of Comet-Stream (eternal HTTP response with Transfer-Encoding: chunked), only they use a header (Accept: text/event-stream) and wrap the chunks with "data:" and "\n\n".
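
To make the wrapping concrete, here is a minimal sketch of serializing one event on the server side. The `SseEvent` shape and `formatSseEvent` name are made up for this example; the field names (`id`, `event`, `data`) and the blank-line terminator come from the event stream format.

```typescript
interface SseEvent {
  data: string;
  event?: string;
  id?: string;
}

function formatSseEvent({ data, event, id }: SseEvent): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  // A payload with line breaks becomes several "data:" lines;
  // the client joins them back together with "\n".
  for (const line of data.split(/\r\n|\r|\n/)) lines.push(`data: ${line}`);
  // A blank line terminates the event.
  return lines.join("\n") + "\n\n";
}
```

For example, `formatSseEvent({ data: "hello" })` produces `"data: hello\n\n"`.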

But yes it's the superior (simplest, most robust, most performant and scalable) way to do real-time for eternity.

The browser is dead, but SSE will keep on doing work for native apps.


A few additional cons to be aware of:

WebSockets lack flow control (backpressure) and multiplexing, so if you need them you either roll your own or use something similar to RSocket.

Also SSE can't send binary data directly. You have to base64 encode it or similar.

WebTransport addresses these and also solves head of line blocking. But I'm concerned that we might run into a similar problem as we had with going from Python2 to Python3 and IPv6. Too easy for people to keep using the old version, and too little (perceived) benefit to upgrading.

As long as browsers still work with TCP, some networks will continue to block UDP (and thus HTTP3/WebTransport) outright.


> WebSockets lack flow control (backpressure) and multiplexing, so if you need them you either roll your own or use something similar to RSocket.

Yes, head of line blocking is an issue, but TCP provides flow control, and if you're not using that, you're going over HTTP3.

> WebTransport addresses these and also solves head of line blocking. But I'm concerned that we might run into a similar problem as we had with going from Python2 to Python3 and IPv6. Too easy for people to keep using the old version, and too little (perceived) benefit to upgrading.

At one time or another, one could have said the same thing about TLS transport, HTTP3, or XHR itself. Because of the comparatively huge domination of a few key browser engines, it's much easier to roll out new browser capabilities & protocols.

> As long as browsers still work with TCP, some networks will continue to block UDP (and thus HTTP3/WebTransport) outright.

By that logic, as long as browsers still work with HTTP 1.1 without TLS, some networks will continue to block HTTP 2 and TLS. While that's not entirely incorrect, the broad adoption of HTTP2 and TLS in particular suggests it's less of a problem than you think.


> Yes, head of line blocking is an issue, but TCP provides flow control

Unfortunately, the way browsers implement WebSocket it undermines TCP's flow control. It's trivial to crash a browser tab by opening a (larger than RAM) file and trying to stream it to a server using a tight loop sending on a WebSocket. WebSocket.bufferedAmount exists, but as of 2019 I failed to use it to solve this problem and had to implement application-level backpressure.

> At one time or another, one could have said the same thing about TLS transport, HTTP3, or XHR itself. Because of the comparatively huge domination of a few key browser engines, it's much easier to roll out new browser capabilities & protocols.

> By that logic, as long as browsers still work with HTTP 1.1 without TLS, some networks will continue to block HTTP 2 and TLS. While that's not entirely incorrect, the broad adoption of HTTP2 and TLS in particular suggests it's less of a problem than you think.

HTTP3 actually falls under my concern. There are still networks that block HTTP3, because it has really nice fallback to HTTP2/1.1, so there's no obvious impact on users.

So I guess the real question is will QUIC be an HTTP/2 or an IPv6, or something in between? Was HTTP/2 ever actively blocked the way UDP is? If so that certainly gives us hope.

The reason I care is that I'm currently developing a protocol that WebTransport is an excellent fit for. But I can't assume WebTransport will work because UDP might be blocked, so I'm having to implement WebSocket support as well, which is a lot more work.


> Unfortunately, the way browsers implement WebSocket it undermines TCP's flow control. It's trivial to crash a browser tab by opening a (larger than RAM) file and trying to stream it to a server using a tight loop sending on a WebSocket. WebSocket.bufferedAmount exists, but as of 2019 I failed to use it to solve this problem and had to implement application-level backpressure.

Before you were saying, "WebSockets lack flow control (backpressure) and multiplexing, so if you need them you either roll your own or use something similar to RSocket.", and now you're saying you can't roll your own? ;-)

> HTTP3 actually falls under my concern. There are still networks that block HTTP3, because it has really nice fallback to HTTP2/1.1, so there's no obvious impact on users.

Erm, HTTP2 & HTTP 1.1 have their own problems, some of which you yourself have identified. We actually rolled back from HTTP2 to HTTP 1.1 because of problems with HTTP2, particularly with mobile performance.

Our migration to HTTP3 has been all win so far. While UDP might be blocked for security reasons, there are security reasons to move to HTTP3.

That said, there are cases where only HTTP/1.0 is supported.

> The reason I care is that I'm currently developing a protocol that WebTransport is an excellent fit for. But I can't assume WebTransport will work because UDP might be blocked, so I'm having to implement WebSocket support as well, which is a lot more work.

I feel your pain. I've been using WebRTC, and the vast majority of the time UDP doesn't seem to be blocked anymore. That said, adoption of HTTP3 seems to be about half that of HTTP2 right now. Not bad for a brand new protocol, but I'd say we still have a significant amount of time to go before HTTP3 is the dominant protocol. I think the path forward is going to require some toil by the likes of you to support both, but no reason you can't support HTTP3 better! ;-)


> Before you were saying, "WebSockets lack flow control (backpressure) and multiplexing, so if you need them you either roll your own or use something similar to RSocket.", and now you're saying you can't roll your own? ;-)

You can, but you need to do it at the application level, which is a lot more involved. It would be nice if it were as simple as checking if bufferedAmount > some threshold and then waiting on a promise before attempting to send again. That's essentially what you get with ReadableStream and WritableStream, which are provided by WebTransport.
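
One common shape for application-level backpressure is a credit window: the sender keeps at most N messages unacknowledged and queues the rest until the peer acks. The sketch below is an assumption about what such a scheme might look like, not the commenter's implementation; `CreditWindow`, `transmit`, and the ack message are hypothetical.

```typescript
class CreditWindow<T> {
  private inFlight = 0;
  private readonly queue: T[] = [];
  private readonly maxInFlight: number;
  private readonly transmit: (message: T) => void;

  // `transmit` would typically wrap WebSocket.send (an assumption of this sketch).
  constructor(maxInFlight: number, transmit: (message: T) => void) {
    this.maxInFlight = maxInFlight;
    this.transmit = transmit;
  }

  // Queue a message; it goes out immediately only if the window has room.
  send(message: T): void {
    this.queue.push(message);
    this.pump();
  }

  // Call when the peer acknowledges `count` messages (a hypothetical "ack" frame).
  acknowledge(count = 1): void {
    this.inFlight = Math.max(0, this.inFlight - count);
    this.pump();
  }

  // Number of messages still waiting for window space.
  get pending(): number {
    return this.queue.length;
  }

  private pump(): void {
    while (this.inFlight < this.maxInFlight && this.queue.length > 0) {
      const next = this.queue.shift() as T;
      this.inFlight += 1;
      this.transmit(next);
    }
  }
}
```

A producer reading a large file would check `pending` and stop reading while it is above some limit, which is what keeps memory bounded.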

> Erm, HTTP2 & HTTP 1.1 have their own problems, some of which you yourself have identified. We actually rolled back from HTTP2 to HTTP 1.1 because of problems with HTTP2, particularly with mobile performance.

Not sure if we're actually disagreeing here. In any case, we can both agree HTTP2 is not a panacea, and actually worse than HTTP/1.1 in some cases.

> That said, adoption of HTTP3 seems to be about half that of HTTP2 right now. Not bad for a brand new protocol, but I'd say we still have a significant amount of time to go before HTTP3 is the dominant protocol.

Hear, hear. Overall I'm bullish on HTTP3 in the long run. I really just hope random enterprise networks don't decide to block it. In any case, it's going to be a big win for the places where it works.


This library allows for using Streams in SSE

https://github.com/rexxars/eventsource-parser


HTTP/2 traffic was looking essentially the same as HTTP/1.1 + TLS.

It wasn't particularly interesting to block to begin with. If UDP is blocked, it's blocked in the name of security.


> As long as browsers still work with TCP, some networks will continue to block UDP (and thus HTTP3/WebTransport) outright.

I keep hearing it, but I've never actually seen such a network. There are many things that run on UDP. I can see it being closed in some tiny offices (but those usually lack brain power to accomplish it) or some dystopian corporate offices you can only see in a movie.

I really don't see how the fact that some networks might ban UDP has anything to do with it. Some networks ban google.com and wikipedia.com, you don't see them failing.


In their excellent article on NAT traversal[0], Tailscale mentions that the UC Berkeley guest wifi blocks all outbound UDP except DNS.

EDIT: According to this[1] issue, Berkeley guest wifi allows port 443, which would solve HTTP3/WebTransport. That's certainly hopeful, and I hope all networks are like this in the future. It's just not a foregone conclusion yet.

[0]: https://tailscale.com/blog/how-nat-traversal-works#have-you-...

[1]: https://github.com/danderson/natlab/issues/1


That's a case of shipping it 'broken' for those people.

If it works literally everywhere else, and the product is valuable, they'll change.


Not a traditional network, but Tor doesn't support UDP.


WebSocketStream is about to ship in Chrome, which adds backpressure: https://chromestatus.com/feature/5189728691290112


Are Firefox and Safari planning to support it?


There is a better approach than base64: binary-sse (https://github.com/luciopaiva/binary-sse)


> As long as browsers still work with TCP, some networks will continue to block UDP (and thus HTTP3/WebTransport) outright.

HTTP2 should still work in that scenario, and then you don't need to worry about multiplexing.


You can't use HTTP2 directly in the browser. Or are you referring to something else?


All browsers released since the end of 2015 support HTTP2. If the server supports it the browser will use it. Unlike HTTP3 this is all TCP.

If you connect to a WebSocket over an HTTP2 connection then you don't need to worry about multiplexing since you can rely on the browser doing it for you - HTTP2 connections support over 200 concurrent streams.


I think we're talking about different types of multiplexing. Are you referring to the browser domain connection limit? HTTP/2 does essentially solve that.

I'm referring to opening multiple data streams on a single connection, which is very useful in a number of contexts. This is supported by HTTP/2 and HTTP/3, but it's only exposed directly to the browser runtime through WebTransport.


HTTP 1.1 did not support multiplexing and browsers limited you to six connections per domain. That made opening multiple WebSockets prohibitive so you would generally need to implement your own multiplexing of logical data streams on top of a single WebSocket.

HTTP2 and HTTP3 connections support the multiplexing of hundreds of streams across a single TCP or QUIC connection. Each stream can be a request or WebSocket. So now you can just open a WebSocket per logical data stream and leave the multiplexing to the browser.

So now you will have a single transport level network connection whether you use multiple WebSockets or implement multiplexing yourself atop a single WebSocket.


Yeah we're talking about different things. Imagine you had a process in the browser that can respond to many concurrent long-running requests from the server. In order to handle this with WebSockets, you need to implement your own multiplexing, because a) the server can't initiate a new WebSocket connection to the browser and b) opening a new WebSocket connection for each request doesn't fit the paradigm of request-response very well.

Note that WebTransport handles this for you, because the server can initiate new streams.

If you're sending lots of data, you also need per-stream backpressure.
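
Rolling your own multiplexing over one WebSocket usually means tagging every binary message with a stream identifier and routing on it at the other end. A minimal sketch follows; the frame layout (4-byte big-endian stream ID, then payload) and all names are assumptions for illustration, and per-stream backpressure is deliberately left out.

```typescript
// Frame layout (an assumption for this sketch): 4-byte big-endian stream ID, then the payload.
function encodeFrame(streamId: number, payload: Uint8Array): Uint8Array {
  const frame = new Uint8Array(4 + payload.length);
  new DataView(frame.buffer).setUint32(0, streamId);
  frame.set(payload, 4);
  return frame;
}

// Throws a RangeError if the frame is shorter than the 4-byte header.
function decodeFrame(frame: Uint8Array): { streamId: number; payload: Uint8Array } {
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  return { streamId: view.getUint32(0), payload: frame.subarray(4) };
}

class Demultiplexer {
  private readonly handlers = new Map<number, (payload: Uint8Array) => void>();

  subscribe(streamId: number, handler: (payload: Uint8Array) => void): void {
    this.handlers.set(streamId, handler);
  }

  // Returns false when no handler is registered for the frame's stream.
  dispatch(frame: Uint8Array): boolean {
    const { streamId, payload } = decodeFrame(frame);
    const handler = this.handlers.get(streamId);
    if (handler === undefined) return false;
    handler(payload);
    return true;
  }
}
```

In a browser, `dispatch` would be called from the WebSocket's message handler with `binaryType` set to `"arraybuffer"`, and either side can start a logical stream simply by picking an unused ID.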


I suppose many use libraries built on top of the native APIs, e.g. socket.io, so for them the change is nothing more than a bump in the lib version.


The article's information about WebRTC is not accurate. You can do client/server WebRTC without a "signaling server". Just make the server do the signaling. It takes a few extra round trips, but it doesn't need to be an extra server. And WebRTC data channels work quite well as a replacement for WebSockets or SSE, especially if you want to avoid head-of-line blocking. And there are many libraries that will do pretty much all of the work for you, like Pion or str0m.

I also think calling the WebTransport API complex is overblown. If you don't want the more advanced things, you can ignore them. If you want to use it like a WebSocket, just open one bidirectional stream and you're basically done. If you want to avoid head-of-line blocking, just open a stream for every message. It's a little more complex, but it's not the kind of thing you need a library for. Github Copilot will probably write the code for you. It's true there aren't as many server libraries out there yet, since WebTransport is still maturing. And we're waiting for Safari to add support.


> You can do client/server WebRTC without a "signaling server"

Huh. The signalling server is implemented in websocket, typically.

It can't be implemented in WebRTC unless you propose an existing decentralized network of existing clients to bootstrap from.


Or, if you’re building for clients with a traditional “enterprise” and “secure” IT infrastructure: add refresh buttons and call it a day. If there’s one thing in my experience that consistently fails in these environments and cannot be fixed due to endless red tape, it’s trying to make real-time work for these types of clients.


Jetty/CometD will fall back to long polling if other transports are not available.


Honestly, all the techniques for this stuff have their problems, including the refresh button.


True, though when carefully implemented it’s the most reliable option I guess.


I've switched to using SSE to get around problems with the refresh button. It's pretty simple and reliable.


That’s great. Unfortunately I’ve seen SSE fail quite a few times in the scenario I described.


My browser has a refresh button. Alas your application likely breaks when I use it.


In this thread's context of sending specific application data from the server, I don't think this will happen.


Not entirely sure why it would break…?


Clearly I don’t know what your application is, but many heavyweight “web apps” don’t cope with a simple refresh, kicking back to a default screen or even login screen in some cases.


Also some apps ignore simple refresh and only react to hard refresh (ctrl-f5 and the like). Refresh is as unreliable as other methods from the user's pov.


Sometimes this is to scope logins to single tabs for security reasons (I think that's why Userify does it that way). It's annoying but for infrequently used apps, no worse than getting logged out every three minutes.


client side state that isn't in the URL or local storage


websockets and sse are a big headache to manage at scale, especially on the backend, where they require special observability. if not implemented really carefully on mobile devices, it's a nightmare to debug on the frontend side

devices switch off the network or slow it down, etc., for battery conservation, or when you don't explicitly do the I/O using a dedicated API for it.

new connection setup is a costly operation, the server has to store the state somewhere and when this stateful layer faces any issue, clients keep retrying and timing out. forever stuck on performing this costly operation. it's not like there is an easy way to control the throughput and slowly put the load on database

reliability wise long polling is the best one IME. even if event based flow is really important, it's better to have a 2 layer backend, where the frontend does long polling on the 1st layer, which then subscribes over websockets to the 2nd layer backend. much better control in terms of reliability


I cannot agree more with you. I have seen people shoot themselves in the foot with WebSockets and SSE. Long polling, even though it is expensive, is the most explainable and scalable approach in my opinion.


SSE supports long polling. You can make the server close the connection whenever you want. SSE supports automatic reconnection, and will even include the last ID seen to let the server continue seamlessly.
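
On the server, continuing seamlessly could be sketched as a bounded event log that replays whatever the client missed. The `EventLog` class and numeric IDs are assumptions for this example; the only part that comes from SSE itself is that the reconnecting client sends the last ID it saw in the `Last-Event-ID` request header.

```typescript
interface LoggedEvent {
  id: number;
  data: string;
}

class EventLog {
  private readonly events: LoggedEvent[] = [];
  private nextId = 1;
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Record an event, dropping the oldest one once the log is full.
  append(data: string): LoggedEvent {
    const event: LoggedEvent = { id: this.nextId++, data };
    this.events.push(event);
    if (this.events.length > this.capacity) this.events.shift();
    return event;
  }

  // `lastEventId` is the raw Last-Event-ID header value, or undefined on a first connection.
  replayAfter(lastEventId: string | undefined): LoggedEvent[] {
    const last = Number(lastEventId);
    if (lastEventId === undefined || !Number.isInteger(last)) return [...this.events];
    return this.events.filter((event) => event.id > last);
  }
}
```

If the client has been away longer than the log's capacity covers, it silently misses events, so a real service would need some way to signal that a full resync is required.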


It's important to remember that SSE won't automatically reconnect for quite a few HTTP status codes (i.e., upstream proxy outages like 50x error codes)


A lot of this was addressed in the linked article - rxdb has mechanisms to mitigate many of your concerns...


Not in the article but also relevant is short polling. While this does not send messages from a server to a client it can still be useful when all other options are not available (on shared hosting for example).

In my experience it even works great when the poll interval is long (for example 20 seconds) but when you also include the message list in each response. That way the client will be up to date when it interacts with the server: user presses a button -> the client sends a request to the server -> the server responds with data and also a list of the latest messages.
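
A rough sketch of that piggybacking idea: every response, whether from a poll or from a user action, is wrapped in an envelope carrying the messages the client has not seen yet. The `Mailbox`, `Envelope`, and cursor names are hypothetical.

```typescript
interface Message {
  seq: number;
  text: string;
}

interface Envelope<T> {
  result: T;
  messages: Message[];
  cursor: number;
}

class Mailbox {
  private readonly messages: Message[] = [];

  post(text: string): void {
    this.messages.push({ seq: this.messages.length + 1, text });
  }

  // Messages with a sequence number greater than the client's cursor.
  since(cursor: number): Message[] {
    return this.messages.filter((message) => message.seq > cursor);
  }
}

// Wrap any response body with the messages the client is missing.
function respond<T>(mailbox: Mailbox, clientCursor: number, result: T): Envelope<T> {
  const messages = mailbox.since(clientCursor);
  const cursor = messages.reduce((max, message) => Math.max(max, message.seq), clientCursor);
  return { result, messages, cursor };
}
```

The client sends its cursor with each request and stores the one it gets back, so the slow background poll only has to cover the gaps between interactions.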


It's also applicable for fast changing data where the proportion of polls that gets an update is high too.


To this day I still don't know why WebSockets and SSE don't support sending headers on the initial request, such as Authorization. Leaving authentication on realtime services entirely up to whoever is implementing the service.

I may be wrong here and the spec suggests a good way to do it, but I've seen so many different approaches that at this point might as well say there's none.


The EventSource API (the browser "client API" for Server-Sent Events) leaves a lot to be desired. While I am a maintainer of the most used EventSource polyfill[1], I've recently started a new project that aims to be a modern take on what an EventSource client could be: https://github.com/rexxars/eventsource-client.

Beyond handling the custom headers aspect, it also supports any request method (POST, PATCH..), allows you to include a request body, allows subscribing to any named event (the EventSource `onmessage` vs `on('named event')` is very confusing), as well as setting an initial last event ID (which can be helpful when restoring state after a reload or similar). And you can use it as an async iterator.

I love the simplicity of Server-Sent Events, but the `EventSource` API seems to me like a rushed implementation that just kinda stuck around.

[1]: https://github.com/eventsource/eventsource


Nice work! This addresses many of the issues I've had with SSE.

Another problem we've never worked out the solution to, is how to send a termination - signalling "there are no more events coming". We always end up having to roll our own, though it felt like something that should've been handled at the protocol layer.


Just read the spec

Clients will reconnect if the connection is closed; a client can be told to stop reconnecting using the HTTP 204 No Content response code.
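
As far as I understand the processing model, the browser's decision boils down to something like the function below: network errors and a stream that ends lead to a reconnect, while any non-200 status (204 included, and also 50x from a proxy) or a wrong content type makes the client give up. The type and function names are invented for this sketch, and it is a simplification rather than a transcription of the spec.

```typescript
type ResponseInfo = { status: number; contentType: string };
type Outcome = "stream" | "reconnect" | "give-up";

function eventSourceOutcome(result: ResponseInfo | "network-error"): Outcome {
  // Dropped connections and similar failures are retried automatically.
  if (result === "network-error") return "reconnect";
  // Compare only the media type, ignoring parameters such as charset.
  const essence = (result.contentType.split(";")[0] ?? "").trim().toLowerCase();
  if (result.status !== 200 || essence !== "text/event-stream") return "give-up";
  // A healthy stream; if it later ends, the client reconnects.
  return "stream";
}
```

So a server that wants to say "no more events" answers the next reconnect attempt with 204, and a client that needs to survive proxy outages has to handle the `error` event and recreate the EventSource itself.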


> To this day I still don't know why WebSockets and SSE don't support sending headers on the initial request

Doesn't the initial request get to send a full set of standard HTTP headers, cookies and all?


It does, but if you're calling it from the browser you can't add arbitrary data to them (the way you can in e.g. a `fetch`)


Someone at Azure thought of this[1]

[1] https://github.com/Azure/fetch-event-source


> you can't add arbitrary data to them

What about intercepting the request with a service worker?


They do send cookies.


I do login the bog standard way with a regular old http request and the server responds with setting an http only cookie. Then I reconnect the websocket, which will then provide the cookie to the server on reconnect.


There's always TLS certificates... ;-)


Wait, what?? Been using these for years. Am I missing something?


The browser EventSource constructor does not have options to pass in your own headers. You can pass an option to have it use the cookies for the domain you’re using. There are libraries that allow you to pass in additional HTTP options, but they essentially reimplement the built-in EventSource object in order to do so. Not terribly difficult, fairly simple spec.
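
To give a sense of how small the parsing side is, here is a sketch of an event stream parser over a complete chunk of text. It is not taken from any of the libraries mentioned; `parseSse` and `ParsedEvent` are made-up names, the `retry` field is ignored, and a real client would feed it incrementally from a `fetch` response body (which is where custom headers become possible).

```typescript
interface ParsedEvent {
  event: string;
  data: string;
  id: string | undefined;
}

function parseSse(text: string): ParsedEvent[] {
  const events: ParsedEvent[] = [];
  let dataLines: string[] = [];
  let eventType = "";
  let lastId: string | undefined; // persists across events, like the last event ID

  for (const line of text.split(/\r\n|\r|\n/)) {
    if (line === "") {
      // Blank line: dispatch the buffered event, if it carried any data.
      if (dataLines.length > 0) {
        events.push({ event: eventType || "message", data: dataLines.join("\n"), id: lastId });
      }
      dataLines = [];
      eventType = "";
      continue;
    }
    if (line.startsWith(":")) continue; // comment line, often used as a keep-alive
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
    if (field === "data") dataLines.push(value);
    else if (field === "event") eventType = value;
    else if (field === "id") lastId = value;
  }
  // Anything not terminated by a blank line is an incomplete event and is dropped.
  return events;
}
```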


Well, that constructor by default sends all the headers you have for your own domain and auth you are entitled to. This is how all other APIs in browsers work due to security and privacy concerns.

If you call to other domains, then this problem is no different to what we had with CORS years ago.


> This is how all other APIs in browsers work due to security and privacy concerns

They're probably comparing it to the fetch and XHR APIs, which both allow custom headers.


They're probably referring to browsers specifically. The WebSocket constructor doesn't allow for headers


Oooh boy you touched a pet peeve. I mean who needs authentication on the modern Web right? /s

The even more irritating thing is that there is nothing preventing this, and every server I've tried supports it. It's only the browser WebSocket API that was designed without this. Cookies are the only thing browsers will deign to send in the initial request.


This is probably naive, but it seems like assuming HTTP/2 or better, an EventSource combined with fetch() for sending messages should be just as good as any other protocol that uses a single TCP connection? And HTTP/3 uses UDP, so even better.

(This all assumes you only care about maintaining a connection when the tab is in the foreground.)

I’m wondering what problems people have run into when they tried this.


One limitation is SSE is text-only, so you can't efficiently send binary data. You have to encode it as base64 or similar.


There is another alternative to base64: https://github.com/luciopaiva/binary-sse


This library does what you are looking for

https://www.npmjs.com/package/@microsoft/fetch-event-source


Was thinking exactly the same thing. H2 with SSE solves 99% of problems? I was wondering if we could push SSE even further along with lower latency, memory usage and CPU resources than doing something completely different.


This presumes the majority of your use case is server-client, but otherwise yes.


I always find articles like this amusing, because I designed an online auction system back in the late 90's. No XHR requests at all. Real-time updates were all handled with server-push/HTTP streaming. It wasn't easy to handle all the open connections at the time, but it could be done to an acceptable scale with the right architecture.


I've spent so many hours trying to communicate to folks the importance of HTTP streaming ... it's an uphill challenge for sure.

Yes, all the benefits of http/2 (or 3) are great, but we should also be aware of what we can take advantage of in http 1.1, especially since it's effectively universally supported.


Actually what he meant is not supported anymore: https://en.m.wikipedia.org/wiki/Push_technology#HTTP_server_...

Comet/SSE/chunked transfer needs XHR to work. x-mixed-replace was weird back in the day and still is.

Edit: maybe you could also use an iframe/frame which holds a chunked connection but that will only give you text.


Chunked transfer encoding (which is indeed the underlying mechanism behind server-push) predates XHR, and AFAIK is still supported and will continue to be as long as HTTP/1.1 is still supported.

You can use a frame/iframes, but you can also just have content that is updated with multi-part MIME that doesn't cause the page layout to be redone.


x-mixed-replace was the only mechanism that did that, and Chrome removed it. It is not standard and never was.


Not only x-mixed-replace. You can do out-of-order HTTP streaming by appending chunks of HTML without any JS at all, thanks to Declarative Shadow DOM: https://lamplightdev.com/blog/2024/01/10/streaming-html-out-...


I'm so confused. Was x-mixed-replace the only way to use chunked transfer encoding (which is part of the HTTP standard) or are you talking about something else entirely?


Actually, you can use chunked transfer encoding to stream data, but it gets appended and the page never finishes loading. That worked back then and it works today, but it is not useful for several reasons, and wasn't back then either.

With x-mixed-replace (as the name implies, x- means experimental) you can stream the page over and over again and the browser switches to the new version. (Chrome still supports that for images, e.g. cheap webcams.)

Tbf, without frames neither mechanism made much sense (even back then), because it would be horrible to use with form fields.

I started to play with the web in the early 2000s, when XHR/long-polling/comet (via iframes; later it used XHR onreadystatechange with chunked encoding, without SSE, which basically was created because of that) started to gain traction. x-mixed-replace was extremely niche even back then because of the limitations it had on the page.

Edit: comet (streaming script tags inside an iframe) of course worked back then. But I never heard of an implementation before 2006 or so (maybe a few years earlier, like 2-3). It would be worth a Wikipedia change if you have older entries. Also, HTTP/1.1 was '97.


> Tbf, without frames neither mechanism made much sense (even back then), because it would be horrible to use with form fields.

Frames were the common way to deal with that, but I even did stuff with having a separate "named" browser window.

> Also http/1.1 was 97

IIRC there was support for it in Netscape before it was really a standard.


> IIRC there was support for it in Netscape before it was really a standard.

That is probably true, since x-mixed-replace started the whole chunked transfer encoding.

It’s probably also what really triggered the invention of comet and later xhr. Netscape was way ahead and Microsoft just pushed it out with money, integration, activex? and of course unfair advantage.


> Actually, you can use chunked transfer encoding to stream data, but it gets appended and the page never finishes loading. That worked back then and it works today, but it is not useful for several reasons, and wasn't back then either.

Oh Server Push has all kinds of issues with it. There are lots of good reasons to prefer the newer protocols for a lot of use cases.


Hello fellow traveler. ;-)

Nobody reads the specs anymore, and to a certain degree I can't blame them, as the protocols/standards have become quite complicated.


I kind of miss long polling. It was so stupidly simple compared to newer tech, and that's coming from someone who thinks WebRTC is the best thing since sliced bread.


SSE isn't really more complex than long polling. The only difference is that the server doesn't close the connection immediately after sending the response. Instead, it waits for more data and sends further responses over the same stream.
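
As a rough illustration of how little the server has to do, here is a sketch of serializing events in the `text/event-stream` format. The `SseEvent` shape and function name are invented for this example; the wire format (field lines, blank-line terminator, colon-prefixed comments) follows the SSE spec. A server would write these strings to the still-open response whenever it has something to say.

```typescript
// Hypothetical helper: serialize one event in the text/event-stream wire format.
interface SseEvent {
  data: string;
  event?: string; // optional event type
  id?: string; // optional id; browsers send it back as Last-Event-ID on reconnect
}

function formatSseEvent(ev: SseEvent): string {
  const lines: string[] = [];
  if (ev.event !== undefined) lines.push("event: " + ev.event);
  if (ev.id !== undefined) lines.push("id: " + ev.id);
  // Each line of a multi-line payload needs its own "data:" field.
  for (const line of ev.data.split(/\r\n|\r|\n/)) lines.push("data: " + line);
  // A blank line terminates the event.
  return lines.join("\n") + "\n\n";
}

// A comment line (leading colon) is ignored by clients, so it can serve as a keep-alive.
const SSE_KEEPALIVE = ": keep-alive\n\n";
```

Sending `SSE_KEEPALIVE` periodically is one way to keep intermediaries from timing out an otherwise idle stream.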


One limitation of SSE compared to long polling (and WebSockets etc) is you can't efficiently send binary data such as cbor, protobuf, etc. Though if your long polling is chatty enough eventually the HTTP overhead will kill your efficiency too.


You can use binary-sse (https://github.com/luciopaiva/binary-sse) with minimal overhead.


Long polling is more amenable than SSE for most HTTP tools out of the box, eg curl. The SSE message body is notably different from plain HTTP responses.

To the OP, you can still build APIs with long polling. They are uncommon because push patterns are difficult to design well, regardless of protocol (whether long-polling, SSE, websockets, etc).

Whiteboarding a push API is a good exercise. There is a lot of nuance that gets overlooked in discussions whenever these patterns come up.


Well, I know I can write applications that use it, but I don't often write code outside the context of a team that has other opinions anymore :)


Agreed - I see SSE as basically a standardized approach to modern long polling


Oh, if only it were that simple.

The networking that makes Second Life go uses long polling HTTPS for an "event channel", over which the server can send event messages to the clients. Most messages go over UDP, but a few that need encryption or are large go over the HTTPS/TCP event channel.

At the client end, C++ clients use "libcurl". Its default timeout settings are not compatible with long polling. Libcurl will break connections and make another request. This can result in lost or duplicated messages.

At the server end, Apache front-ends the actual simulation servers, to filter out irrelevant connection attempts (Random HTTP attacks that try any open port, probably). Apache has its own timeouts, and will abort connections, forcing the client to retry.

There's a message serial number to try to prevent this mechanism from losing messages. The Second Life servers ignore the serial number the client sends back as a check. Some supposedly compatible servers from Open Simulator skip sequential numbers.

The end result is an HTTPS based system which can both lose and duplicate what were supposed to be reliable messages. Some of those messages, if lost, will stall out the user's activity in the game. The people who designed this are long gone. The current staff was unaware of how bad the mess is. Outside users had to find the problem and document it. The company staff has been trying to fix this for months. It seems to be difficult enough to fix that the current action is to defer work on the problem.

So, no, long polling is not "stupidly simple".

The right way to do this is probably to send a keep-alive message frequently enough that the TCP and HTTPS levels never time out. This keeps Apache and libcurl on their "happy paths", which work.


My solution to broken connections has actually been to have relatively short timeouts by default, eg 10 seconds. That guarantees we have a fresh connection every so often without any assumptions about liveness. You can even overlap the reconnects a bit (eg 10 second request timeouts, but reconnect every 8 seconds) as long as the application can reconcile duplicated messages - which it should be able to do anyway, for robustness reasons.

Really, anytime there is any form of push (whether SSE, long polling, etc) then you need another way to re-hydrate to the full state. In which case you are nearly at the point of doing plain old polling to sidestep the complexity of server-driven incremental updates and all the state coordination problems that entails.

Of course with polling, you lose responsiveness. For latency-sensitive applications (like an interactive mmorpg!) then HTTP is probably not the correct protocol to use.

It does sound like Second Life has its own special blend of weirdness on top of all that. Condolences to the engineers maintaining their systems.


I've seen a bunch of timeouts / heartbeat / keep alive durations. I think it might have been Wireguard, but 25 seconds seems like a good number. Usefully long, most things that break are more likely to do it at ~30 seconds, and if there's an active activity push at 15 or 20 seconds with device wakeup then the keep alive / connection kill might not even happen.

Full Refresh; yes please, in the protocol, with a user button, with local client state cached client code and reloaded state on reconnect. Maybe even a configurable polling period; some services might offer shorter poll as a reason to pay for a higher tier account.


> with a user button

If the user ever has to push a "retry" button, the networking levels are very badly designed. Just because some crappy web sites work that way does not mean it's OK.


The user shouldn't _have_ to. However, a 'refresh state' (and validate state, more gracefully than a full kill and reload) button can be both helpful and psychologically reassuring.

It can also be very helpful for out of band issues, like ISP hiccups, random hardware failures, bitflips, etc.


> Of course with polling, you lose responsiveness.

No, that's the whole point of long polling. The server delays the reply until it has something to say. Then it sends it immediately.

The trouble here is middleware which does not comprehend what's going on and introduces extraneous retry logic.


Sorry, that part of the comment was probably not clear - I was comparing "plain old polling" (stateless request-reply with no delay) with "push", ie long polling


> The Second Life servers ignore the serial number the client sends back as a check. Some supposedly compatible servers from Open Simulator skip sequential numbers.

I mean, if you're not respecting long polling, of course long polling doesn't work. That's like complaining that http doesn't work because your networking stack doesn't look at port number and distributes packets randomly to any process.


>I kind of miss long polling. It was so stupidly simple compared to newer tech, and that's coming from someone who thinks WebRTC is the best thing since sliced bread.

I still use it all the time. There are plenty of applications where the request overhead is reasonable in exchange for keeping everything within the context of an existing HTTP API.


You can still use long polling with HTTP/2 nowadays; it isn't going anywhere.


Jsonrpc over websockets is underrated tech. Simple, easy to implement, maps to programming language constructs (async functions, events, errors) which means code looks natural as any library/package/module usage devs are used to, can be easily decorated with type safety, easy to debug, log, optimize etc, works on current browsers and can be used for internal service to service comms, it's fast (ie. nodejs is using simdjson [0]), can be compressed if needed (rarely the need), we built several things on top of it ie. async generators (avoids head of line blocking, maps naturally to typescript/js async generators).

[0] https://github.com/simdjson/simdjson


It's been a while since I've used websockets, but at least the last time I did, "simple" wouldn't be the word I'd have used. All kinds of annoying issues between different browsers. SSE was generally much simpler.


That must have been when the spec wasn't stabilized yet and browsers were still introducing it. Those times are long gone.


I don't know about long gone. I still support a websocket solution that needs to have server-push as a fallback because of browser incompatibility issues.


It shouldn't be a problem for about a decade [0]?

What's your hit rate for the fallback?

[0] https://developer.mozilla.org/en-US/docs/Web/API/WebSocket


The hit rate for fallback is pretty low. It was less than 5% the last I checked.


Does this 5% include or exclude non-human traffic?


Includes.


Do you have some specific examples?


Old Android handsets seem to be the big ones.


(I work at Stream, we power activity feeds, chat and video calling/streaming for some very large apps)

You should in most cases just use websockets with a keep-alive ping every 30 seconds or so. It's not common anymore to block websockets on firewalls, so fallback solutions like Faye/Socket.io are typically not needed anymore.

WebTransport can have lower latency. If you're sending voice data (outside of regular WebRTC), or have a realtime game, it's something to consider.


I'm making a WASM browser dungeon crawler game using WebTransport. It currently does not have great support -- namely Safari -- but because of other API incompatibilities I'm not planning on supporting Safari :P

WebTransport is a bit more work than other ones, like SSE, but the flexibility and performance make it worth it IMO.


A decent number of corporate firewalls still don't support web sockets...

That means if you build something that requires web sockets, prepare to have a deluge of support/refund requests from the most valuable clients who think your site is broken.

I suggest just having a once-per-second polling fallback, perhaps with an info bar saying 'the network you are connected to is degrading your experience'.


Certainly this can't be true? I believe you, but do you have any actual examples?


No UK government offices seem to allow it... That's a couple of million potential users right away.


Just use SignalR. It'll automatically choose whatever comes through.


Yes it's true. At Ably we support websockets, SSE and comet fallbacks (simple long-polling and streamed long-polling). It's less and less common but there are firewalls that fail to handle websockets correctly, or simply block them. I can't name specific companies/examples, but call centers are one example - the network and desktop environments are fully locked down.

We also see in these cases that streamed HTTP can also be broken by the firewall - for example a chunked response can be held back by the firewall and only forwarded to the client when the request ends, as a fixed-length response. Obviously that breaks SSE and means you can't just use streamed comet as a fallback when websockets don't work.


Wild, thanks!


This is all addressed in the article


Hello, I am author of https://github.com/centrifugal/centrifugo. Our users can choose from WebSocket, EventSource, WebTransport (experimental for now, but will definitely stabilize in the future). WebRTC is out of scope as the main purpose is central server based real-time json/binary messaging, and WebRTC makes things much more complex since it shines for peer-to-peer and rich media communications.

What I'd like to add is that Centrifugo also supports HTTP-streaming, which the OP does not mention. This transport has advantages over EventSource: it can send a POST body on the initial request from the browser (SSE cannot), it supports binary, and thanks to the Readable Streams browser API it's widely supported by modern browsers.

Another thing I'd like to mention about Centrifugo: it supports bidirectional WebSocket fallbacks with EventSource and HTTP-streaming, and does this without requiring sticky sessions in a distributed scenario, which solves one more practical concern. I guess nobody else has this at this point. See https://centrifugal.dev/blog/2022/07/19/centrifugo-v4-releas.... Sticky sessions are an optimization in Centrifugo's case, not a requirement.

If you are interested in the topic, we also have a post about WebSocket scalability - https://centrifugal.dev/blog/2020/11/12/scaling-websocket - it covers some design decisions made in Centrifugo.


I love how SSE is just a "don't close the connection and just keep flushing data". I bet IE6 supports this.


There are at least two polyfills for it, but I think most of them require at least IE10. (but IE6 has so many other issues that probably 90% of your other JS won't work anyway... so glad it's gone!)


I've been reading about WebRTC. Does anyone know if browser-to-browser communication actually works reliably in practice, specifically NAT traversal? I've been hesitant to research it further because of this issue; most of the connection parts seem to be legacy VoIP-related protocols.


Works well! Lots of developers/companies use WebRTC with NAT Traversal.

You can also use it in a client/server setup. Check out 'WebRTC SFU'

I wrote a little bit about the different topologies in [0]

[0] https://webrtcforthecurious.com/docs/08-applied-webrtc/#webr...


Is there a modern open-source solution for bridging a traditional stateless web application to real-time notifications - one that's implemented all the best practices from the OP? Something like pusher.com but on self-hosted infrastructure/k8s, where messages from clients are turned into webhooks to an arbitrary server, and the server can HTTP POST to a public/private channel that clients can subscribe to if they know the channel secret.

I've come across https://github.com/soketi/soketi and https://centrifugal.dev/ but not sure if there are more battle-tested solutions.


I maintain the Mercure protocol (built on SSE) and the reference implementation (written in Go, available as a standalone binary and a Caddy module) which does exactly that: https://mercure.rocks

In addition to the free and open source server, we also provide a cloud offering and on-premises versions that support clustering using Redis Streams, Kafka, Pulsar or Postgres LISTEN/NOTIFY as backends.

The solution is used by many big actors in production for years:

How Raven Controls uses Mercure to power big events such as Cop 21 and Euro 2020: https://api-platform.com/con/2022/conferences/real-time-and-...

Pushing 8 million Mercure notifications per day to run mail.tm: https://les-tilleuls.coop/en/blog/mail-tm-mercure-rocks-and-...

100,000 simultaneous Mercure users to power iGraal: https://speakerdeck.com/dunglas/mercure-real-time-for-php-ma...


Another one to check out https://partykit.io

They are building on top of Cloudflare, and getting started is a breeze.

That being said they are also fairly new, but based on everything I have seen, I am a fan


NATS is another similar, very simple and powerful, option - particularly with their nats.ws library.

https://nats.io/


There's also Pushpin, if you want the API to blend with your existing app.

Disclosure: Pushpin lead dev.


Great comparison. Would love to see http response streaming added to the mix. I think a lot of use cases involving finite streams sent from server to client can be handled by the server streaming JSONL in the response. I tend to prefer this over SSE for finite data streams.


Server-sent events ARE HTTP response streaming. SSE is a standardized way to implement response streaming.


I agree, but SSE is a more complex protocol that requires additional handling on both client and server, with capabilities that may not be relevant for your use case (multiple event types, event id). For JS clients and JS servers it is not particularly onerous to implement, but for other ecosystems can require a fair bit of code. JSONL streaming is very easy to implement on both ends, so all else aside I think would be preferred if all you really want to do is stream JSON values.


I've implemented SSE from scratch on the server and XHR streaming/parsing from scratch on the client side (which would be necessary for JSONL), and SSE was way simpler. Unless there's another way to do JSONL in a browser that I'm not aware of?


If you use the fetch API you can get a readable stream and party on without too much difficulty. You can also implement a transform stream in ~10SLOC that will make the reader vend parsed JSON objects and can be reused easily.
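
A sketch of the parsing core such a transform could delegate to, assuming the text has already been decoded from bytes. The class name is hypothetical. It buffers any trailing partial line so that chunk boundaries falling in the middle of a JSON value are handled.

```typescript
// Hypothetical incremental decoder: feed it text chunks in arrival order,
// get back every JSON value whose line is complete.
class JsonlDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line (or "" if the chunk ended on a newline).
    this.buffer = lines.pop() ?? "";
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line) as unknown);
  }
}
```

In a browser, the chunks could come from `response.body.pipeThrough(new TextDecoderStream())`, with a `TransformStream` whose `transform()` enqueues each value returned by `push()`; that wiring is omitted here.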


This is a good point. In fact, you just helped me realize that I can probably replace this[0] at work with a fetch implementation. ReadableStreams weren't generally available across browsers when I wrote that. This would also allow us to return binary data if we so desired (XHR can handle binary but it can't stream it chunk by chunk). Thanks!

[0]: https://github.com/anderspitman/xhr-stream-dl


This library does that for SSE

https://github.com/rexxars/eventsource-parser


HTTP (POST) AJAX calls + SSE is one of the simplest ways to implement bi-directional real time functionality. IMHO this is a much more robust and nicer way than web sockets for a huge amount of applications which use web sockets today.


I did some testing using SSE to send push notifications to my phone if someone set off a sensor. It worked kinda well, but the browser had to be running in the background for it to work. After that I implemented a chat for a meme app that I created to share memes with my friends, using WebSockets (Open Swoole), and it is working nicely too. I never tested how many clients it can handle at once, but I guess the bottleneck would be my server, not the software.

Open Swoole is very easy to set up and there are lots of tutorials online. Got my ass kicked a little bit trying to make my websocket secure (wss), but in the end it worked fine.


Browser push APIs are hard to design well regardless of the underlying protocol (SSE, Long Polling, Websockets, etc). There are a bunch of things to consider:

- Is this a full or partial state update?

- What if the client misses an update?

- What if the client loses connectivity?

- How can the server detect and clean up clients that have disappeared?

How those are answered in turn raises more questions.

SSE or long polling or even WebSockets is a relatively unimportant implementation detail. IMO the bigger consideration should probably be ease-of-use and tooling interoperability. For that, I would say that long polling (or even just polling) is the clear winner.


> Is this a full or partial state update?

It's a message. Mapping protocol units to messages was always your business.

> What if the client misses an update?

Sequence numbers on updates combined with a "fill in" mechanism through a separate request.

> What if the client loses connectivity?

Then more important things won't work either.

> How can the server detect and clean up clients that have disappeared?

The SSE client will restart dropped connections. You can have the server opportunistically close connections that haven't received messages recently. The browser will automatically reconnect if the object is still alive on the client side.

> For that, I would say that long polling (or even just polling) is the clear winner.

Coordinating polling intervals while simultaneously avoiding strong bursting behavior is genuinely not fun.
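
The "sequence numbers plus fill-in request" answer above could look roughly like this on the client. This is a sketch under the assumption that the server assigns strictly increasing integer sequence numbers starting at 1; the type and class names are invented.

```typescript
// Hypothetical tracker for server-assigned, strictly increasing sequence numbers.
type SeqResult =
  | { kind: "apply" } // next expected update
  | { kind: "duplicate" } // already seen, safe to drop
  | { kind: "gap"; from: number; to: number }; // missing range to request separately

class SequenceTracker {
  private lastSeen = 0;

  observe(seq: number): SeqResult {
    if (seq <= this.lastSeen) return { kind: "duplicate" };
    if (seq === this.lastSeen + 1) {
      this.lastSeen = seq;
      return { kind: "apply" };
    }
    // Updates lastSeen+1 .. seq-1 never arrived; the caller should fetch them
    // (the "fill in" request) before applying this one.
    return { kind: "gap", from: this.lastSeen + 1, to: seq - 1 };
  }

  // Call after the missing range has been fetched and applied.
  markFilled(upTo: number): void {
    this.lastSeen = Math.max(this.lastSeen, upTo);
  }
}
```

After a `gap` result the caller fetches the missing range, calls `markFilled`, and then observes the held-back update again.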


All that speaks to the point that the underlying protocol does not matter that much. The larger design is a more important consideration than the specific push tech.

None of this favors SSE, or anything else for that matter. You still have to think about bursting / thundering herd behavior after server restarts, which would affect any protocol. The browser auto reconnecting doesn't mean you can assume the connection remains alive indefinitely if a reconnect hasn't happened; you may want periodic notifications as a keepalive to enable more immediate recovery. Long-lived server state introduces its own distributed coordination and cleanup problems, which strict request-reply polling sidesteps entirely. Etc.
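
One common way to soften the thundering-herd problem after a restart is exponential backoff with random jitter on the client, so reconnects spread out instead of arriving together. A minimal sketch, with hypothetical defaults (1 s base, 30 s cap); it applies to hand-rolled reconnect loops, since the browser's built-in EventSource manages its own retry timing.

```typescript
// Pick a uniformly random delay in [0, min(cap, base * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

A client would call this with the number of consecutive failed attempts and reset the counter once a connection succeeds.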

There is no silver bullet, only tradeoffs in the design space that need to be matched to your application's requirements.


> the underlying protocol does not matter that much

Then why is long polling the "clear winner?"


For interoperability with HTTP tooling (eg, curl) it is the clear winner


SSE, which is just an HTTP request, works perfectly fine with curl or any other library which makes HTTP requests.


Curl and most libraries don't unwrap SSE for you out of the box, you lose the one-shot request-response leading to nonstandard flow control in integrations, etc.

Sure you can find random libraries to help but they do need to be specifically built for SSE. Plain (long-)polling does not need any of that, which makes it more attractive from an API interoperability standpoint.
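
For a sense of what "unwrapping" SSE involves on a non-browser client, here is a minimal parser sketch. It assumes the input text contains only complete events; a streaming client would also have to buffer a trailing partial event. The names are invented, and the `retry` field is deliberately ignored.

```typescript
interface ParsedSseEvent {
  type: string; // "message" unless an "event:" field says otherwise
  data: string;
  lastEventId: string;
}

// Minimal sketch: parses text that contains only complete events.
function parseSse(text: string): ParsedSseEvent[] {
  const events: ParsedSseEvent[] = [];
  let eventType = "";
  let dataLines: string[] = [];
  let lastEventId = "";

  for (const line of text.split(/\r\n|\r|\n/)) {
    if (line === "") {
      // Blank line: dispatch if any data was collected, then reset per-event state.
      if (dataLines.length > 0) {
        events.push({ type: eventType || "message", data: dataLines.join("\n"), lastEventId });
      }
      eventType = "";
      dataLines = [];
      continue;
    }
    if (line.charAt(0) === ":") continue; // comment, often used as keep-alive
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
    if (field === "event") eventType = value;
    else if (field === "data") dataLines.push(value);
    else if (field === "id") lastEventId = value;
    // "retry" and unknown fields are ignored in this sketch.
  }
  return events;
}
```

It is not much code, but it is code that a plain request-response long poll does not need at all, which is the interoperability point being made.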


This is one of my favorite articles along these lines:

https://blog.sequin.io/events-not-webhooks/


Great article, thanks for sharing. NATS is an excellent option along these lines, particularly with their nats.ws library

https://nats.io/


Yeah that is a great article, and I agree that long polling an events endpoint along with a cursor gets you most of the way there for browser clients. At that point, things start to resemble the Kafka protocol.
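
A sketch of the cursor-based long-polling loop, with the network call injected so the logic stays testable. The `EventPage` shape and all names are assumptions for illustration; in practice `fetchPage` might wrap a `fetch` call to some events endpoint that holds the request open until new events exist or a timeout passes.

```typescript
// Hypothetical response shape: events after `cursor` plus the cursor for the next request.
interface EventPage<T> {
  events: T[];
  nextCursor: string;
}

type FetchPage<T> = (cursor: string) => Promise<EventPage<T>>;

// Runs `rounds` long-poll requests, handing every event to `handle` in order.
// Returns the cursor to persist so a restarted client resumes where it left off.
async function consumeEvents<T>(
  fetchPage: FetchPage<T>,
  handle: (event: T) => void,
  startCursor: string,
  rounds: number,
): Promise<string> {
  let cursor = startCursor;
  for (let i = 0; i < rounds; i++) {
    const page = await fetchPage(cursor);
    page.events.forEach((ev) => handle(ev));
    // Advance only after the whole page was handled, so a crash mid-page
    // replays it instead of skipping events.
    cursor = page.nextCursor;
  }
  return cursor;
}
```

Because the client owns the cursor, the server needs no per-client session state, and a replayed page only requires that handlers tolerate duplicates.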


What I want to know is, on iOS can we have Web Workers running for a few hours when the browser is in the background, with setInterval communicating w the network or will they get suspended?


iOS apps can't really do this so I'd assume browser apps can't either. There are specific background scheduler APIs you need to go through on iOS to do this in which the OS decides when to run these tasks and you have very little control.


The article is from rxdb, which has various mechanisms to handle such considerations


Like what ?


Don't assume reliability


I never had a good experience with anything using Server-Sent Events. This especially goes for TFS: open one tab too many and all your TFS tabs just freeze.


TFS?


Team Foundation Server probably.


> Long-Polling: The least scalable due to the high server load generated by frequent connection establishment, making it suitable only as a fallback mechanism.

That makes no sense. Long-polling scales linearly like all the other ones as well.


there are a few ways that long-polling is "least scalable"

- other approaches give you locality without sticky load-balancing: let's say your application server needs to subscribe to a topic once a connection is established. With long polling you need to set up and tear down that subscription every time; other approaches let you keep that HTTP stream alive and periodically send some stuff on it, resulting in mostly just memory overhead.

- each returned payload will result in at least 1 extra packet (the initial request headers, assuming the response headers and the payload fit into a packet) and at least 1/2 RTT delay.


Man... I was trying to use WebRTC over ten years ago to implement livestreaming from your phone's camera *within a PhoneGap app*! Didn't work too well.


I wonder why mobile push notifications are just a side-note in this article as mobile clients are responsible for a large part of the global traffic.


Aren't mobile push notifications an entirely different tech, one that significantly uses the capabilities of mobile carriers?


I don't know about the implementation details, but even the use case is slightly different, as the discussed techs are also meant to transfer larger amounts of data. At the same time, mobile push is more about triggering synchronization than transferring large amounts of data.

However, since the discussed techs all have major problems with mobile connections, I still think it should have been discussed in more depth.


You can do mobile push notifications to a phone even if they aren't connected to a mobile carrier.

That said, I'd argue SSE is the browser equivalent of mobile push notifications.


SSE generally requires the page to be active in the foreground, which is not a viable solution for mobile.

Web Push is a distinct API for browsers to emulate native push notifications via service workers. https://developer.mozilla.org/en-US/docs/Web/API/Push_API


I thought it was more that the Document object had to stay alive. ...although, these days a lot of mobile browsers will kill that thing off if the page is not in the foreground, so yeah...


iOS and android push does not rely on mobile carrier capability afaik, it has its own service


Yes, iOS has APNS and Android has FCM. One can use FCM for both Android and iOS as FCM is capable of abstracting away APNS. If you are building on React Native with Expo, you can also use Expo Push, which abstracts away both APNS and FCM, albeit it is not fully featured.


SSE in gin (go) is broken and has been for years. No one uses it and no one bothers to fix it.


Good thing there's other ways to implement SSE - be it in Go or many other servers and languages...


r3labs/sse seems pretty good, but doesn't seem to handle backpressure or defend against slow-client attacks. (actually, I haven't seen any libs that do.. maybe this isn't a problem for anyone?) You could roll your own, since the protocol is extremely simple.


That's a really small data point. I bet there're many projects using Go with SSE.


Note that the author misses an essential point: custom compression dictionaries, which can only be used by WebSockets (WS13) and hardly by the others, as that would break compatibility. I'd argue that you can push websocket data usage way lower than the other protocols, if you use a binary compression based on a predefined schema.

In WebSockets, you can use plug&play extensions which can modify the payload on both the client and the server, which make them also ideal for tunneling and peer-to-peer applications.

I've written an article from an implementer's perspective a while ago, in case you are interested [1]

[1] https://cookie.engineer/weblog/articles/implementers-guide-t...


Will this still be the case now that Shared Dictionary support is being rolled out in Chromium?

https://developer.chrome.com/blog/shared-dictionary-compress...


[flagged]


> Whoever made this article is pretty clueless

This genre of HN comment irks me. Technical writing is an exercise in taking a lot of material and distilling it down for an audience. Writing that does a good job of this will often look basic to someone with a deep understanding of the tech. It doesn't mean the author is clueless; it just means they made different decisions on what to cut than you would have.


That the author didn't mention UDP "leaves out" the vs. objective, however. If it were a fair contest, the article would mention the use of WebRTC over UDP, which isn't just for streaming media (an assumption I had before doing a bit more research on it).

> This article aims to delve into these technologies, comparing their performance, highlighting their benefits and limitations, and offering recommendations for various use cases to help developers make informed decisions when building real-time web applications.

So, the article's intent was to help others. When we do things like this, we should ensure all technologies being evaluated will be covered exhaustively. Otherwise, you risk leaving out an important part of the puzzle and then assumptions kick in which ignore a possible better solution for a given use case.

It looks like UDP use is possible between a browser and a server, and that connection has to have components that deal with dropout, given it's UDP. There is a LOT to consider and deal with implementing UDP over WebRTC, so I put a dump of this up here: https://pastebin.com/xgA78dky


Author here. The article is mostly about web apps. How would your signaling server emit new connection updates to clients in the scenario you describe?


I think what they mean is that yes, the signalling needs to be done over "traditional" web APIs (websockets, etc), but that's just for discovering/negotiating the p2p connections. The actual data transfer between the peers then happens over UDP which can have a bunch of advantages over TCP for some scenarios.


The WebRTC section of the article seemed weird in general. The reason WebRTC doesn't specify signaling requirements is that clients can use any communications mechanism they'd like for signaling, and in the case where you are using WebRTC as a server-push mechanism, the "signaling" server and the server you want to receive pushes from could be the same server, allowing the use of "regular" HTTP as the transport for your "signaling" data.

With QUIC & HTTP/3, and things like RFC 8441 & 9220, you could well be using UDP with non-WebRTC protocols, and TCP stacks & routers tend to be pretty well tuned these days, so UDP doesn't necessarily have much of an advantage in this kind of use case.

If you check out the benchmark the article uses, it specifically breaks out using "unreliable WebRTC/WebTransport". The "unreliable" looks to be referring to UDP (I despise how people misleadingly associate UDP with "unreliable"). They also have "reliable WebRTC/WebTransport", which appears to be using TCP. In the latter case, they actually found in some cases WebTransport doing a tad better in the face of packet loss, which is interesting. I haven't looked at the details of the tests, but in my experience benchmarking WebRTC is not as straightforward as one might expect; it's entirely possible the nature of the benchmark itself is leading to WebRTC's better & worse performance.


Does WebRTC still ignore web proxies?



