I feel like there are so many pitfalls when designing this - is there something standard and trusted (would TLS work?) that you could build your application on top of?
I assume there's TLS in the server connection already, but the encryption here is to make the communication unavailable to the server for decryption, so "bare" TLS does not solve the problem.
With TLS you need pubkeys you can trust (the certificate authority hierarchy provides that trust for the open Internet) or you're vulnerable to MITM. You could potentially share pubkeys using an out-of-band mechanism similar to the one currently used for symmetric key distribution, and tunnel the TLS connection through the server's shared comms channel. That would work OK for two parties, but it becomes significantly more cumbersome with three or more, since each TLS session is a pairwise key exchange. Notably, however, this would not transit secret keys through server-controlled web pages where they could be available to JavaScript. Something like Noise [0] might also be useful for a similar pubkey model.
Unfortunately, this kind of cryptography engineering is hard. Key distribution and exchange is hard. There isn't much of a way around learning the underlying material well enough to find this sort of issue yourself, but misuse-resistant libraries can help. Google's Tink [1] is misuse-resistant and provides a handful of blessed ways to do things such as key generation, but I'm not sure if it's suitable outside of cloud deployments with KMS solutions. nacl/secretbox handles straight encryption/decryption with sound primitives, but it still requires a correct means of key generation [2] and distribution.
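To make the key generation point concrete, here is a minimal sketch in Python using only the standard library. It assumes the 32-byte key and 24-byte nonce sizes that NaCl's secretbox uses; the function names are hypothetical, and this covers generation only, not distribution.

```python
import secrets

KEY_BYTES = 32    # secretbox key length (assumed target format)
NONCE_BYTES = 24  # secretbox nonce length

def generate_key() -> bytes:
    """Return a fresh secret key drawn from the OS CSPRNG."""
    # secrets uses the operating system's cryptographic RNG,
    # unlike the random module, which is predictable.
    return secrets.token_bytes(KEY_BYTES)

def generate_nonce() -> bytes:
    """Return a fresh random nonce; never reuse one with the same key."""
    return secrets.token_bytes(NONCE_BYTES)
```

The usual mistakes here are deriving the key from a short password without a proper KDF, or using a non-cryptographic RNG; drawing the full key from the OS CSPRNG avoids both.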
Agree. When people hear the adage "don't roll your own crypto", they often think it refers to crypto primitives only. In reality, it's also hard to design a secure crypto protocol, even if the underlying crypto primitives are secure.
I guess TLS has a dependency on the public key infrastructure (e.g. Let's Encrypt, or whoever issues widely accepted certs), which makes end-to-end encryption between users harder (most of this stuff is intended for server auth and encryption)?
But otherwise big +1 not to reimplement crypto when there are alternatives. Another option for secret key stuff might be ssh?
There is no requirement to use TLS with the Web PKI if you are making your own application (not the browser); you can use TLS with custom certificate management.
You still need to figure out how you handle trust and key authentication somehow, but that is true of all cryptographic protocols.
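One simple way to handle that trust question without a CA is to pin a certificate fingerprint that the two parties exchanged out of band. The following is a rough sketch, not a complete client: `cert_fingerprint` and `matches_pin` are hypothetical helper names, and it assumes you already have the peer's certificate in DER form.

```python
import hashlib
import hmac

def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()

def matches_pin(der_cert: bytes, pinned_hex: str) -> bool:
    """Check a certificate against a fingerprint shared out of band."""
    # Accept the common "AA:BB:..." display form as well as plain hex.
    normalized = pinned_hex.replace(":", "").strip().lower()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(cert_fingerprint(der_cert), normalized)
```

In a Python client the DER bytes could come from `ssl.SSLSocket.getpeercert(binary_form=True)` after the handshake; the connection should be dropped before sending anything if `matches_pin` returns False.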
It would be hard to do end-to-end TLS (where the server proxies the raw connection) because:
(a) you can't share one TLS connection to the host between multiple clients; if you wanted multi-client support while preserving end-to-end TLS, the host would need to maintain a TLS connection with each client and waste bandwidth re-uploading the same image; and
(b) there is no client software requirement, so you would have to do the TLS decryption clientside in the browser (maybe via WASM) unless you're OK with having viewers download software.
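The bandwidth cost in (a) is easy to put numbers on. This back-of-the-envelope sketch assumes that with a shared symmetric key the host uploads one ciphertext and the server fans it out, while with per-viewer TLS the host encrypts and uploads a separate copy for each viewer. The function name and the figures are made up for illustration.

```python
def host_upload_bytes(frame_bytes: int, viewers: int, per_viewer_tls: bool) -> int:
    """Bytes the host must upload to deliver one frame to every viewer."""
    # Per-viewer TLS: each session has its own keys, so the same frame
    # is encrypted and uploaded once per viewer.
    # Shared key: one ciphertext is uploaded and relayed by the server.
    copies = viewers if per_viewer_tls else 1
    return frame_bytes * copies

frame = 200_000  # hypothetical 200 kB image
shared = host_upload_bytes(frame, viewers=5, per_viewer_tls=False)
pairwise = host_upload_bytes(frame, viewers=5, per_viewer_tls=True)
```

With five viewers the host's upload grows fivefold under pairwise TLS, and it keeps growing linearly with the audience.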