If you're looking to do "pure" P2P in the browser, we're not there yet: we can't accept incoming connections without something in between right now.
This isn't a limitation of PeerJS or even WebRTC. It's a limitation imposed by IPv4 deployments that rely on NAT and PAT. The services in the middle act as a mutually reachable broker that sets up tunnels between all the endpoints concerned. Even after the connection is established, additional third-party services are still required for authentication, identity management (trust/certificates/keys), and possibly codec resolution.
If all parties were IPv6 first they would still need the secondary services mentioned above, but they could likely broker a direct peer-to-peer connection without a separate service provider.
It very much is a limitation of WebRTC. Real sockets perform better than WebRTC since they can attempt NAT traversal or talk to the CPE to perform port forwarding, something you can't do from a browser. Otherwise P2P file sharing wouldn't work. You may not get perfect connectivity this way, since not all NATs can be bypassed, but on the other hand WebRTC does not work at all without signalling.
Sockets, TCP sockets, are streams of TCP segments in a given format. TCP is layer 4; IP is layer 3. Other socket protocols sit at layer 7 or are implemented over UDP.
That being said, there is no bypassing NAT, but you can tunnel through it given an open connection from within the network. For example, an HTTP server cannot initiate a connection to a computer inside a NAT network, but it can send a response to a request that came from inside that network.
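The response-over-an-existing-connection idea can be sketched in a few lines; this is a toy model with invented names, not any real server API. The "server" has no way to dial a NATed client, but it can push over any channel a client opened first:

```javascript
// Toy model: a server cannot initiate a connection to a NATed client,
// but it can reply over a connection the client opened.
const openChannels = new Map(); // clientId -> callback the client registered

// A client "dials out" through its NAT, leaving a channel the server can use.
function clientConnect(clientId, onMessage) {
  openChannels.set(clientId, onMessage);
}

// The server can only push to clients that already dialed in.
function serverPush(clientId, message) {
  const channel = openChannels.get(clientId);
  if (!channel) throw new Error(`no open connection from ${clientId}`);
  channel(message);
}

let received = null;
clientConnect("alice", (msg) => { received = msg; });
serverPush("alice", "hello"); // works: Alice dialed out first
// serverPush("bob", "hello"); // would throw: Bob never connected out
```

This is exactly the trick long-polling and WebSocket relays exploit: the NAT happily routes the reply because the client created the mapping on the way out.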
Most P2P services make use of a central server to connect users no differently from instant messaging, and thus these services are generally considered a backdoor into a network. BitTorrent is different because it makes use of two protocols doing different things simultaneously.
The approaches don't always work, but often are sufficient for consumer grade routers.
> For example HTTP cannot connect to a computer inside a NAT network but it can send a response to a request from inside that network.
Well yeah, that's HTTP. It's a client-server protocol. There is no coordination in connection setup. And it's TCP. It's not designed to do the necessary gymnastics to establish incoming connections through a NAT. So HTTP being HTTP is not an argument for NAT traversal being impossible.
It's much easier with UDP, where you can multiplex incoming and outgoing traffic over a single socket, and it will work for any remote peer as long as your NAT uses endpoint-independent mapping (EIM) for UDP.
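A toy model (all names invented) of why endpoint-independent mapping makes UDP hole punching feasible: under EIM the NAT assigns one external port per internal socket regardless of destination, so the mapping that a STUN query created is reusable for any remote peer.

```javascript
// Sketch of an EIM NAT: the mapping table is keyed only by the internal
// endpoint, never by the remote destination.
function makeEimNat() {
  const mappings = new Map(); // internal "ip:port" -> external port
  let nextPort = 40000;
  return {
    translate(internalAddr /*, remoteAddr is ignored under EIM */) {
      if (!mappings.has(internalAddr)) mappings.set(internalAddr, nextPort++);
      return mappings.get(internalAddr);
    },
  };
}

const nat = makeEimNat();
// The port a STUN server observes is the same port any later peer will see:
const seenByStun = nat.translate("10.0.0.2:5000");
const seenByPeer = nat.translate("10.0.0.2:5000");
console.log(seenByStun === seenByPeer); // true
```

An endpoint-dependent NAT would key that table by (internal, remote) pairs, so the address STUN reported would be useless for reaching any other peer; that is the case hole punching can't handle.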
If you do your homework you quickly find out that a signaling server is required (the WebRTC spec itself assumes one) and that STUN/TURN is needed for NAT traversal. Common sense says you shouldn't lean on public shared offerings of either, IMO, and the PeerJS docs do call this out.
Right, it’s called a signaling server, which is mainly what PeerJS seems to be. You need a method of connecting peers through the signaling server; they decided on generating UUIDs that are shared out of band or through a peer discovery API. You can do this pretty much however you want; the spec deliberately leaves signaling unspecified.
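The core of such a server is tiny. This is a minimal sketch with invented names (not the PeerJS API): peers register under an id, and the server blindly relays opaque offer/answer/candidate blobs between ids without ever inspecting them.

```javascript
// Minimal signaling relay: route opaque payloads between registered peers.
function makeSignalingServer() {
  const peers = new Map(); // peerId -> message handler
  return {
    register(peerId, onSignal) { peers.set(peerId, onSignal); },
    relay(fromId, toId, payload) {
      const target = peers.get(toId);
      if (!target) throw new Error(`unknown peer: ${toId}`);
      target({ from: fromId, payload }); // the server never parses the SDP
    },
  };
}

const server = makeSignalingServer();
const inbox = [];
server.register("callee-uuid", (msg) => inbox.push(msg));
server.register("caller-uuid", () => {});
server.relay("caller-uuid", "callee-uuid", { type: "offer", sdp: "..." });
```

In a real deployment the `register`/`relay` calls would ride over WebSockets or HTTP, but the routing logic is all there is to it, which is why signaling bandwidth costs are negligible.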
No affiliation with them, but we've had decent success with Twilio hosting our STUN/TURN (https://www.twilio.com/stun-turn). We still host the signaling server ourselves since it has negligible bandwidth/cpu requirements.
Yeah, and a bunch of other caveats, hence my ask to look into the specific technologies to understand more how it works, and why it works the way it works.
And while it's true that you don't strictly need a server for signalling, most people would expect a library called PeerJS to work P2P over the internet, without requiring a speaker/microphone and without having to be within speaker/microphone range.
WebRTC is complicated, it can be frustrating to get into!
If you have a chance take a look at https://webrtcforthecurious.com it is a CC0/Free book I am writing. Would love your opinion and if it helped at all. I also try and write https://github.com/pion/webrtc in a way that others can learn from it. I put all the specific tech in different repos so people can see the big picture.
Yes, one huge challenge of P2P over the internet is piercing the NAT veil of ignorance. WebRTC hides this complexity, but I found it highly informative to read about how NAT traversal works even if I'll never touch this mess in my line of work -> https://tailscale.com/blog/how-nat-traversal-works/
Reliable NAT traversal is what propelled Skype to fame and billions back in the day, and was later codified by the VoIP folks at the IETF into STUN/ICE and TURN. WebRTC is basically a VoIP client with JS API built into the browser.
While it does go down from time to time and can't be relied on for a serious product, I've had a couple fun projects using that server for nearly 5 years now, without having to touch anything.
It's still great that they provide this service. I'm curious if anyone else is hosting a slightly more reliable public server as alternative.
When you say "we're not there yet", what do you think would be the best step forward? Currently you need a server in the middle because your home router isn't open to the public internet. Would that be the answer? Opening up your routers to the internet?
I don't see one myself; at the very least, you need a signalling server. Under IPv6 there is less NATing involved, but you're still going to have problems with firewalls.
That's where the signaling comes in: the "caller" generates an "offer", and this has to be relayed to the other end(s). From there, the clients take that and generate an answer that links the two. The offer/answer includes all of the IP info the machine can generate, but on a NAT network that's really only as far as your local gateway, your IPv6 addresses, etc.
You need a STUN server to add in the parts that include your external IP. That's really all STUN does: it's a service further up the network that tells you a more reachable address.
Tested this a couple of years ago and Chrome prevented WebRTC connections from ever establishing if you're not connected to internet proper (it was doing some lookup to a Google domain before trying to actually create the WebRTC connection, even if I had nothing to do with Google, and stopped the attempt if it couldn't connect). Might have changed by now.
That might have been Google's STUN server? If that was required at one time, it's now fixed! getUserMedia requires a secure context (HTTPS or localhost), so LAN-only video calls are kind of a PITA.
You can easily do DataChannel only. For dev work I have a signaling server on a Class C address. Just for testing interop on Safari with my Linux box running signaling and WebRTC agent.
If memory serves me right it's a security issue: WebRTC only allows connections when proper SSL certs exist for the initial brokered handshake, so you would only be able to do this if your LAN has some sort of self-signed SSL root cert, with the cert placed in each machine's trusted cert store.
WebRTC doesn't use any public CA certificates to my knowledge. DTLS-SRTP uses self-signed certificates on both sides, with the certificate hashes or key fingerprints being included in the SDP signalling messages.
Maybe you're thinking of the "secure origin" requirement? (The page hosting the JS needs to be a "trusted context", i.e. no file:// and no http://)
You need, at the very least, a mechanism to exchange ICE candidates: data on which IP/port/protocol are available on each peer.
So if you had a web server on that LAN, yes. Otherwise you need a mechanism like QR codes, or sneakernet a floppy disk with the details to the other peers.
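Packaging the details for QR codes or sneakernet is just serialization. A sketch with invented helper names (the `sdp` string is a stand-in, not a real session description):

```javascript
// Encode a session descriptor as a compact text blob for out-of-band
// signaling (QR code, copy/paste, a file on a floppy...).
function encodeSignal(descriptor) {
  return Buffer.from(JSON.stringify(descriptor)).toString("base64");
}

function decodeSignal(text) {
  return JSON.parse(Buffer.from(text, "base64").toString("utf8"));
}

const offer = { type: "offer", sdp: "v=0 ..." }; // stand-in for a real SDP offer
const qrPayload = encodeSignal(offer);           // this string goes in the QR code
const restored = decodeSignal(qrPayload);        // the other peer scans and decodes
```

Real SDP blobs run to a few KB, which still fits in a QR code, though trickle ICE makes this clumsier since candidates keep arriving after the initial offer.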
This is the assumption I had as well last time I tried it, but it's not like that, at least last time I tried it a couple of years ago. Might have changed by now.
I created multi party P2P audio/video chat in ~200 lines of code a while back, here it is: https://github.com/ScottyFillups/p2pchat (see room.js and all the files under client/)
I love his work. His "webtorrent" is also great. I always wondered if anyone is using it on their website to lower the cost of distributing large files.
There's a big opportunity in the RTC space for someone to build a lib/service that "just works". We built a project that started with PeerJS for the prototype and then moved to simple-peer for the prod version, but it's still a nightmare in WebRTC land. All sorts of random errors and quirks, connection glare, reconnect handling, etc. This is still very much an unsolved problem.
We're a YC company that tries to do exactly that: WebRTC-based features that "just work", with APIs for everything you might want to do with real-time media:
https://www.daily.co/
As you noted below, the signaling itself is the easy part. The long tail of RTCPeerConnection-related corner cases, bandwidth management, analytics and debugging real-world user experience, scaling usage both horizontally and geographically, building features like recording, scaling meeting size beyond 4 participants, optimizing for specific use cases, dealing with browser/platform quirks ... we've tried to make life easy/(ier) for developers in all of those areas.
There's still a lot of room in the space of WebRTC troubleshooting tooling, precisely due to the complexity inherent of "doing it right".
I know well about the mess of difficult to debug technologies that form WebRTC... and on top of that you'll want monitoring, recording, reconnection logic when things go wrong. Of course, add the ability to seamlessly scale when there are lots of users joining a call. The list goes on and on even for a seemingly simple service that "just works"!!
I've been maintaining and improving the oss Kurento server for several years now (https://www.kurento.org/), and the kinds of issues that users find are crazy. A slight misconfig in an intermediate proxy can cause random issues that are very hard to track down.
Shameless plug: We're now building OpenVidu (https://openvidu.io/) on top of Kurento, to provide some of these features. This is a tool that in turn (pun intended) eases writing videoconference services. Most people looking into starting their own WebRTC app from scratch with Kurento, would be better served by OpenVidu!
I worked at a job that was trying to do this. The issue they ran into was money.
* Signaling wasn't a big money maker. You are just exchanging small blobs.
* We spent a lot of time helping people debug their networks/explaining WebRTC and making SDKs. I enjoyed the work, but didn't feel like it scaled well.
* Adding features caused major paralysis. Everyone wanted different authentication or different signaling patterns etc...
Would you be open to having a brain dump chat? I want to deliver some WebRTC solutions in the near future but haven’t done this before, would be awesome to just pick someone’s brain about what happens after release :)
Sean and Pion are awesome! Definitely talk to Sean.
I'm happy to talk to you, too. I do a lot of "let's just talk about real-world things you'll encounter building video features" calls. You don't need to be a Daily (my startup) customer or even plan to be one. I'm kwindla at daily dot co.
Signalling is actually the easy part, as I'm sure you know. It's everything else that you take for granted when doing voice comms: auth, handshakes, reconnects, etc.
They all seem really simple, until you actually sit down to do it yourself. Mad props to the FaceTime crew...nobody else has mastered seamless RTC the way they have.
I’m curious, did you implement the perfect negotiation logic from the WebRTC spec? I’m working on my own WebRTC applications and was hoping that would at least solve the glare problem in practice.
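The heart of the perfect-negotiation pattern is one glare rule, lifted out here as a pure function so it can be reasoned about in isolation (a sketch, not the full spec example, which also handles rollback and `setRemoteDescription`): when both sides send an offer at once, the impolite peer ignores the incoming offer and the polite peer yields to it.

```javascript
// Perfect negotiation's collision rule: only the impolite peer ignores an
// incoming offer that collides with one it is making itself.
function shouldIgnoreOffer({ polite, makingOffer, signalingState }) {
  const offerCollision = makingOffer || signalingState !== "stable";
  return !polite && offerCollision;
}

// Both peers mid-offer: the impolite one ignores, the polite one yields.
const impoliteIgnores = shouldIgnoreOffer({
  polite: false, makingOffer: true, signalingState: "have-local-offer",
});
const politeIgnores = shouldIgnoreOffer({
  polite: true, makingOffer: true, signalingState: "have-local-offer",
});
console.log(impoliteIgnores, politeIgnores); // true false
```

Which side is "polite" is an arbitrary, out-of-band agreement (e.g. the peer that joined the room second); the pattern only works if exactly one side is polite.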
I tried PeerJS in a pet project a few months ago. The one thing that became immediately obvious is that peer-to-peer between browsers isn't great. You can basically throw testing out the window; there is no good way to emulate P2P locally. Testing my PeerJS project involved having two computers, and occasionally required one to be connected through a VPN. One problem I found was that connecting over LAN was really flaky. Most times, Chrome to Chrome was fine. Chrome to Firefox was also mostly fine. Firefox to Firefox had a pile of quirks: connections only worked if I completely quit Firefox and started it again with just my two windows, and even then, sometimes the connection failed.
The ICE/TURN server stuff is also a mess. I ended up installing and hosting a coturn server on DigitalOcean, and it was a pain. Documentation wasn't great, stackoverflow/forum questions were typically old, and there didn't appear to be any obvious alternatives.
I also tried simple-peer by feross, which seemed to be fine, and most quirks remained as browser issues.
I ended up ditching WebRTC and went with a server using WebSockets. Even with peer to peer, I'm stuck using a server somewhere, so I'd rather use a WebSocket server. WebRTC just introduced too many unknown variables. Is there a problem with an ICE/TURN server? Is there a LAN issue? Can browser vendors properly do P2P to each other without a bunch of "if Firefox, do this; if Chrome, do this; if something else, panic"? The WebSocket server took 100% of these concerns out of the equation.
I'd love to be able to use webrtc confidently, but with the current tools for implementing and testing, I don't think I can say it is for myself.
So I think the TLDR is that webrtc peer to peer isn't really consumer-ready. The browser support seems a little hit and miss, testing through a LAN appears to be very flaky, and testing via node/some other framework is spotty at best.
Disagree. You can implement functional tests with a loopback signal server. You can implement e2e tests with network emulation (e.g. using iptables/iptraf). Automated testing using the browser WebRTC stack (obviously run basic tests in Node.js for convenience) is annoying but doable with Puppeteer or the like. It just takes time and effort.
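One way to build that loopback signal server, as an in-memory sketch (names invented): two endpoints whose sends are delivered straight to the other side, so connection-setup logic can be exercised in-process without any network at all.

```javascript
// Loopback signaling pair: whatever one endpoint sends, the other receives.
function makeLoopbackPair() {
  const a = { handlers: [], onMessage(fn) { this.handlers.push(fn); } };
  const b = { handlers: [], onMessage(fn) { this.handlers.push(fn); } };
  a.send = (msg) => b.handlers.forEach((fn) => fn(msg));
  b.send = (msg) => a.handlers.forEach((fn) => fn(msg));
  return [a, b];
}

const [caller, callee] = makeLoopbackPair();
const seen = [];
callee.onMessage((msg) => seen.push(msg));
caller.send({ type: "offer", sdp: "..." }); // delivered synchronously, in-process
```

In a functional test you'd hand `caller` and `callee` to the same code that normally wraps a WebSocket, so the signaling path under test is identical to production minus the transport.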
The point of WebRTC vs hairpin'ed websocket streams is that the bulk data doesn't pass through intermediate hosts, making the system less costly to run and more privacy preserving. Note that any TURN service required can always be provided or controlled by one of the parties -- it doesn't have to be a centralized service.
Sure, you can do all that, but then I think you're in docker-land or something similar, and you are essentially building a system that works the way you think it does, and it may not reflect reality (arguably that is most testing).
And I think if someone is to build this containerized system for testing, it shouldn't necessarily be some app developer like me, it should be someone like the chromium or firefox team that know exactly how the browsers expect to communicate
I’m curious, does your solution not need audio and/or video? I could agree that WebRTC is overkill if only data channels are used, but AV blows complexity up massively doesn’t it?
Audio and video would have been a nice to have, which is why I was going this route. I also wanted the data to never touch an external server so users could know their data is as safe as the people they share it with. The data for this app, by nature, isn't important to have a history, so just being able to share data immediately was all I needed, and p2p seemed like the optimal first step.
I cut out audio and video from my features, and for the most part, things are working just fine.
I’m a bit confused, are you using WebRTC for the peer to peer connection or not? If you’re not, doesn’t that imply a server in the middle that all data is sent through? Sorry don’t mean to be challenging, just want to understand how people work in this space!
I'm not anymore, because it was incredibly frustrating to test locally. My original goal was p2p, for two reasons. I didn't want a backend if I didn't need one, and I wanted to maintain a lot of privacy features for users. So now I just have a websocket relay server instead, which isn't my favorite, but it does the job.
Such a strange wording. Sounds like you're saying "It's too simple, we need to make it more not-simple" while anyone would agree that simple is better than non-simple. What if it can be simple, useful and scalable? That's the ideal I think.
No application implemented within a browser is "simple". It's obfuscated and hidden complexity of a scale so large that only mega-corps are capable of handling it, and so they choose which direction things go.
Simple would be a real application not subject to the constant churn and complexity of corporate driven web.
I used PeerJS for a project, http://subsect.net , that has been running for a few years. I have run my own PeerServer on Heroku all of that time without incident. I have tested connectivity over WebRTC from many places around the world and have never had a problem other than from the lobbies of some banks as it seems they generally block IP traffic.
WebRTC is a transitional technology and I look forward to the day blocks of IPv6 addresses are granted free to all people at birth and fixed addresses are supported on all devices.
Wow, this seems really cool. I've looked into WebRTC before, and every time I try to read the specs to see if I can implement something, I'm turned away by how difficult it seems to get everything going for a hello-world example. This seems to simplify all of that and I'd love to try it out some time.
Related question, does anyone know how heavy of a load is a STUN/TURN server? If I build a hobby project using this, and use a (for example) Heroku free tier dyno to host a STUN/TURN nodejs server, how many users/traffic can I expect to be able to run?
STUN is only used to negotiate connections, not transmit data, so its load is light. TURN, though, relays the media itself whenever a direct connection can't be established, so its bandwidth use can be substantial. I don’t have experience running this in production yet, have just done tons of reading and a moderate amount of prototyping.
OT: I was trying to use this for a glorified multi-stream mobile video casting to a Zoom streaming server. I used a nodejs instance which was very easy to set up.
But I could not find a solution for an in-browser mobile WebRTC video solution. Pretty much all implementations went through an app.
I bounced off the documentation for STUN/TURN servers about a month ago when I was trying to make a chrome addon to stream a tab to a group of people who click a link.
I also got hung up on trying to use WebRTC with Next.js
Might be time to simplify my stack a little and try again.
Multiparty conferencing without a media relay is difficult, as bandwidth requirements scale quadratically with the number of participants for a full mesh topology.
You can also have the participant with the beefiest connection act as such a relay, but if every participant is on a metered or upload limited connection, that's not an option.
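The quadratic scaling is easy to put numbers on; these figures are assumed for illustration, not measurements. In a full mesh each of the n participants uploads its stream to the n-1 others:

```javascript
// Full-mesh stream counts: every participant sends to every other one.
function meshUplinkStreams(n) {
  return n - 1; // outgoing copies per participant
}

function meshTotalStreams(n) {
  return n * (n - 1); // total streams in flight across the whole call
}

// At an assumed 1 Mbps per video stream, a 10-person mesh call means each
// participant uploads 9 Mbps and the call moves 90 Mbps in aggregate.
console.log(meshUplinkStreams(10)); // 9
console.log(meshTotalStreams(10)); // 90
```

That 9 Mbps uplink is already more than many asymmetric home connections can offer, which is why mesh calls tend to fall apart somewhere around 4-6 participants.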
It's pretty sad actually: We seem to be caught in a perpetual catch-22 of "nobody needs public IP reachability and symmetric bandwidth for home connection, since everyone uses beefy cloud servers anyway" and "we need beefy cloud servers because home connections are asymmetric and NATs are horrible"...
CPU usage on each client is also an issue. Architecturally, WebRTC connections are always peer-to-peer, in the sense that each transport carrying media is negotiated individually and does its own bandwidth shaping.
This has some really nice properties. Doing the bandwidth shaping individually for each transport maximizes video quality for each track, but it also means that in an N-way call you have to encode your outgoing video N-1 times. Even with a perfect network connection, you run out of cpu to do the encoding at some point.
Today, on a pretty new-ish laptop the limit for how many outgoing videos you can encode (<waves hands about codecs and settings>) is ~10. On an older Android phone that limit is ~1.
It's possible to imagine changing how WebRTC works so that separate transports can reuse encoded streams. And hopefully that will become possible at some point (https://www.w3.org/TR/webrtc-nv-use-cases/). But not anytime soon. :-(
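The encode-count arithmetic above can be made concrete (the topology names here are invented labels, not WebRTC API terms): a full mesh encodes once per remote peer, while routing through an SFU, or the stream reuse proposed for WebRTC-NV, needs a single encode regardless of call size.

```javascript
// Outgoing encodes a participant must perform, by call topology.
function outgoingEncodes(participants, topology) {
  if (topology === "mesh") return participants - 1; // one encode per remote peer
  if (topology === "sfu") return 1;                 // encode once, server fans out
  throw new Error(`unknown topology: ${topology}`);
}

// With the ~10-encode laptop ceiling mentioned above, mesh calls hit the
// CPU wall around 11 participants even on a perfect network.
console.log(outgoingEncodes(11, "mesh")); // 10
console.log(outgoingEncodes(11, "sfu")); // 1
```

The same arithmetic explains the ~1-encode limit on older phones: on such devices even a three-way mesh call is already over budget.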