The way it works is that the code phrases are used to do a key exchange (PAKE), which means that the only way they're useful to an adversary is if someone is able to MITM your connection and guess the passphrase. The presumption is that this is unlikely enough that you should see a bunch of incorrect passwords and abort, knowing something is wrong, before someone is able to successfully MITM you.
I admit I'd rather see a longer default passphrase too, but fortunately this is adjustable on the sender's end of the connection, so you can choose a longer one if you'd like.
Quickly looking, the word list is 1626 words, which means 1626^3 = 4 billion combinations. I don't know how it works in practice but I would guess the initial sending communication is always quite short-timed...
It can be enough for PAKE. Depends on what your requirements are.
Edit: to be more specific: 32 bits means that an attacker has a 1 / 4294967296 chance of guessing your password. They do not get multiple tries, because that's not how PAKE works. It's akin to agreeing with someone that you will meet in the park and exchange a short secret phrase to prove your identities, whereupon you will exchange GPG keys with each other or whatever. You don't need 128 bits of security for the "meeting in the park" exchange, any short unguessable phrase will do.
That's right. But you'd have to attack an extraordinary number of transfers to have even a small chance of managing to get one by luck, and an attack of that scale would very quickly become obvious since everyone would have their transfers interrupted. I agree with you in principle though, I'd like to see just a bit more entropy.
Increasing the passphrase to four words would bring the odds up to 1 / 6,990,080,303,376. In magic-wormhole I believe there is a flag to change the number of words used by default. It appears that schollz/croc allows you to use your own passphrase, but not to increase the default number of words; that would be a good feature request.
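For anyone who wants to check the arithmetic in this thread, here is a minimal Python sketch. The 1626-word list size is just the figure quoted above (not verified against croc's source), and the function names are made up for illustration.

```python
import math

WORDLIST_SIZE = 1626  # figure quoted in this thread; an assumption, not checked against croc


def combinations(words: int, wordlist_size: int = WORDLIST_SIZE) -> int:
    """Number of equally likely passphrases made of `words` words."""
    return wordlist_size ** words


def entropy_bits(words: int, wordlist_size: int = WORDLIST_SIZE) -> float:
    """Entropy in bits of a passphrase chosen uniformly at random."""
    return words * math.log2(wordlist_size)


for n in (2, 3, 4):
    print(f"{n} words: {combinations(n):,} combinations, {entropy_bits(n):.1f} bits")
```

Each extra word adds a little under 11 bits, assuming words are picked uniformly and independently.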
You're quite right, of course, but keep in mind that magic-wormhole is using a much smaller wordlist, and in fact only has 1 / 65536 odds by default. The people writing software in this space don't seem to believe this is a credible threat.
How would tab completion work in a situation like this? Are the clients exchanging information about the passphrase over the same communication channel?
It's not clear to me how this differs from Magic Wormhole, which is in wide use already. If it's a Golang implementation of the idea, there's already an interoperable Golang Magic Wormhole: https://github.com/psanford/wormhole-william.
I was inspired by wormhole to make croc. croc does not use the wormhole spec. I found it necessary to diverge from the wormhole spec so I could easily implement features that I find important. AFAIK I've* implemented features in croc that are still not available in wormhole, including:
- resuming transfers [1]
- ipv6 support [2]
- local transfer without public relay [3]
- simple installation for windows [4]
- sending folders without zipping (i.e. send in place) [5]
*some features, like ipv6 support, were generously tested/coded/inspired by others.
* I believe (someone else can confirm) that you can generate a single-binary `wormhole-william` for Windows that will interoperate with magic-wormhole.
* I believe v6 support works in wormhole as well.
How does your resumption feature work, cryptographically?
> How does your resumption feature work, cryptographically?
Files are collections of "blocks". This allows files to be read in parallel (reading block at position X) and written in parallel (writing block at position X). Resumption works by having the recipient determine empty blocks (positions filled with 0's) and then requesting the sender to only send required blocks (specified by positions).
This has a trade-off - there are no "progress" files needed (which might not get removed if you don't know about them and don't want to resume) but it also means you can't send a file filled with 0's.
> but it also means you can't send a file filled with 0's.
To be clear, this can't support a file that contains a block worth of 0s anywhere in the file? That seems like a severe limitation.
Although, the recipient should have encrypted blocks, right? So it's only when the output of the hash is all 0's that it should conflict with the resumption algorithm
> but it also means you can't send a file filled with 0's.
I think I was sleepy because I errored in saying that. You can send a file with 0's without a problem.
The only little bug is if you resume a transfer of a file that is filled with zeros then the recipient will re-request blocks it already has (because they are zeros) so that parts of the file get sent twice (worst-case scenario). It is a bug, but a small one IMO that has never manifested in my reality (via issues or my own use). Happy to accept PRs to fix it though!
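A rough Python sketch of the idea, for illustration only: it is not croc's actual code. It assumes the recipient preallocates the file with zeros at its full size, and the block size and function names are hypothetical.

```python
BLOCK_SIZE = 4  # tiny, hypothetical block size so the example is easy to follow


def missing_blocks(partial: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indices of blocks that are all zeros (treated as 'not yet received')."""
    missing = []
    for offset in range(0, len(partial), block_size):
        block = partial[offset:offset + block_size]
        if block.count(0) == len(block):
            missing.append(offset // block_size)
    return missing


def resume(partial: bytes, source: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Copy only the blocks the recipient reports as missing (same-length inputs assumed)."""
    result = bytearray(partial)
    for block_index in missing_blocks(partial, block_size):
        start = block_index * block_size
        result[start:start + block_size] = source[start:start + block_size]
    return bytes(result)


source = b"abcd\x00\x00\x00\x00ijkl"                # middle block is genuinely zero
partial = b"abcd\x00\x00\x00\x00\x00\x00\x00\x00"   # last block has not arrived yet
print(missing_blocks(partial))            # the genuine zero block is listed as missing too
print(resume(partial, source) == source)  # the file still ends up complete
```

In this sketch the only cost of a real all-zero block is that it gets sent again on resume; the resulting file is still correct.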
> which might not get removed if you don't know about them and don't want to resume
The files can
1. Contain backreferences to the target files whose progress they represent (full path).
2. Be stored in a well-known location, so the program can sweep through all of them at start-up or shut-down and garbage-collect all those whose target file no longer exists.
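The two points above could look roughly like this in Python. This is a sketch under my own assumptions: the `.progress.json` naming, the JSON layout, and both function names are hypothetical, and keying progress files by file name alone would collide for same-named targets in different directories.

```python
import json
import tempfile
from pathlib import Path


def write_progress(progress_dir: Path, target: Path, received_blocks: list[int]) -> Path:
    """Record which blocks of `target` have arrived, with a backreference to its full path."""
    progress_dir.mkdir(parents=True, exist_ok=True)
    progress_file = progress_dir / (target.name + ".progress.json")
    record = {"target": str(target.resolve()), "blocks": received_blocks}
    progress_file.write_text(json.dumps(record))
    return progress_file


def sweep(progress_dir: Path) -> list[Path]:
    """Delete progress files whose target no longer exists; return the removed paths."""
    removed = []
    for progress_file in progress_dir.glob("*.progress.json"):
        target = Path(json.loads(progress_file.read_text())["target"])
        if not target.exists():
            progress_file.unlink()
            removed.append(progress_file)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "video.bin"
    target.write_bytes(b"\x00" * 16)
    write_progress(Path(tmp) / "progress", target, [0, 2])
    target.unlink()                          # the user deleted the partial download
    print(sweep(Path(tmp) / "progress"))     # the orphaned progress file is collected
```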
To me this is the killer feature. It's not terrible to transfer files between different dev computers - the users probably have the understanding or software to make that happen.
magic-wormhole has most everything (currently it's missing capabilities for multiple file transfers and file resuming), but it requires installing lots of the Python ecosystem which is tricky for non-developers (and Windows users)
Croc has binaries for Windows available for download, and I don't want to build it myself. I don't want to depend on another third-party client to use some random protocol.
With croc the clients are first party.
This seems very similar to Magic Wormhole which I've been using for years: https://github.com/warner/magic-wormhole. I'll admit croc's README sells it better though.
The author's blog post on Croc credits magic-wormhole:
magic-wormhole has most everything (currently it's missing capabilities for multiple file transfers and file resuming), but it requires installing lots of the Python ecosystem which is tricky for non-developers (and Windows users).
This is tech from an alternative reality where we're not forced to use Teams, GApps and other shit. It's a reality where you "croc" a screencast via a shareable croc URL, posted via Matrix chat to your manager.
The claim is wrong, I didn't know about the rsync compression (-z option, wish it was the default). I ran a test using the rsync compression and rsync transferred a little bit faster (~4% faster). croc was faster than wormhole the last couple of times I tried, though. I'll update the readme.
> Does this also apply for data that's already compressed?
Assuming that rsync runs close to line speed on your computer, then a 1.5x speed up would be by definition impossible for already-compressed data. Moreover, rsync already offers compression, so any improvement would have to be from multiplexing. Maybe that does get 1.5x on some connections, but I'd be skeptical until I saw some real data.
>Does this also apply for data that's already compressed
Of course not, but there are ways like multiple connections and threading (preventing blocking or slow io) that might increase throughput considerably, but normally? Nah.
I made a similar tool, HelloEle, which lets you upload/download files with file indexes.
You can connect it to any AWS S3 API compatible bucket (e.g. AWS buckets, Backblaze...)
I know very little about malware, but I tried clicking at @yakczar at the bottom of the page where it says "by all means you can send an Issue, a PR, ask a question, or tweet me (@yakczar)." and my antivirus software stopped me saying it was a potential threat (trojan). It points to domain ctt . ec (don't know if safe), different from all other usernames (github/). Seems to me there's something wrong, right?
This appears to be a link shortener service provided by "click to tweet dot com", which prefills some information into a Tweet for you to tweet at the author. It seems to be a convenience thing. Chances are your antivirus software is a little over-cautious and some malware has at some point used the link shortener.
The first port is for communication, the four other ports are for transferring data in parallel. You can set up a relay to use only two ports (one for comm, one for transfer) but in general I found that the parallel transfer gives a little speed-up.
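To illustrate the general idea of spreading blocks over several data channels, here is a small Python sketch. It is not croc's implementation: real network I/O is replaced by in-memory copying, the round-robin assignment is my own assumption, and the block size and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

DATA_CHANNELS = 4   # number of parallel data ports mentioned above
BLOCK_SIZE = 1024   # hypothetical block size for this sketch


def send_over_channel(data: bytes, channel: int) -> list[tuple[int, bytes]]:
    """Pretend to send every DATA_CHANNELS-th block; returns (position, block) pairs."""
    blocks = []
    step = DATA_CHANNELS * BLOCK_SIZE
    for position in range(channel * BLOCK_SIZE, len(data), step):
        blocks.append((position, data[position:position + BLOCK_SIZE]))
    return blocks


def parallel_transfer(data: bytes) -> bytes:
    """Split data across channels, 'transfer' concurrently, and reassemble by position."""
    received = bytearray(len(data))
    with ThreadPoolExecutor(max_workers=DATA_CHANNELS) as pool:
        results = pool.map(lambda ch: send_over_channel(data, ch), range(DATA_CHANNELS))
        for blocks in results:
            for position, block in blocks:
                received[position:position + len(block)] = block
    return bytes(received)


payload = bytes(range(256)) * 20 + b"tail"
print(parallel_transfer(payload) == payload)
```

Because every block carries its position, the receiver can write blocks in whatever order the channels deliver them.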
Good question. My impression of it is that it's a little janky at present, compared to Magic Wormhole.
Yes - there's a relay server, but you don't have to trust the relay. (As of changes made in March - before that Croc sent the key in the clear to the relay.)
The first three characters of the passphrase are used to establish a shared channel between the sender and receiver, and the rest of the key is used to do a PAKE, which is a secure method for key exchange. The default passphrase is only three words long, and according to another comment the wordlist is only 1626 words, so you should assume the first word of the keyphrase is entirely blown and useless for securing you from the relay. So that makes the space effectively 1626^2 = 2.6M passwords. To MITM you the relay would have to guess which of those passwords is correct.
Magic Wormhole works a bit differently. IIRC it dynamically requests a channel from the server which is prepended to the passphrase as a number. This seems a bit more resilient to me, as the birthday problem suggests it's quite likely two people will end up with the same channel just by bad luck with Croc.
Edit: indeed, a test forcing similar passwords beginning with the same three characters completely screws Croc up. I can do "croc send -c 'xyz blahblahblah' filename" on multiple computers without error, but croc receives go to the first channel even if they have the wrong password.
Edit: There are only ~900 different three letter word beginnings in the wordlist, so you'd only need to claim that many channels on the server to make it unusable for anyone who lets Croc pick passwords, and occasional collisions between users are certain. Lol that should probably be fixed.
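A back-of-the-envelope way to estimate accidental collisions is the standard birthday calculation. The Python sketch below assumes every prefix is equally likely, which is not true for English words, so it gives a lower bound; the 900 figure is the approximate count quoted above and the function name is illustrative.

```python
def collision_probability(concurrent_users: int, channels: int) -> float:
    """Probability that at least two of `concurrent_users` pick the same channel,
    assuming each channel is equally likely (real prefixes are not uniform)."""
    if concurrent_users > channels:
        return 1.0
    p_all_distinct = 1.0
    for i in range(concurrent_users):
        p_all_distinct *= (channels - i) / channels
    return 1.0 - p_all_distinct


CHANNELS = 900  # approximate prefix count quoted above; an assumption

for users in (2, 5, 10, 36):
    print(f"{users:>2} concurrent transfers: {collision_probability(users, CHANNELS):.3%}")
```

Under the uniform assumption, a few dozen simultaneous transfers are already enough to make at least one collision more likely than not.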
> "croc send -c 'xyz blahblahblah' filename" on multiple computers without error
That seemed to be a bug and is fixed now. After two parties enter a channel it should give you a "room is full" response.
> Magic Wormhole works a bit differently. IIRC it dynamically requests a channel from the server which is prepended to the passphrase as a number. This seems a bit more resilient to me, as the birthday problem suggests it's quite likely two people will end up with the same channel just by bad luck with Croc....There are only ~900 different three letter word beginnings in the wordlist, so you'd only need to claim that many channels on the server to make it unusable for anyone who lets Croc pick passwords, and occasional collisions between users are certain. Lol that should probably be fixed.
I think wormhole uses only 99 channels [1] so it is also susceptible to a DOS attack. But generally, collisions in channels can occur but are probably pretty rare because I don't have enough people using croc simultaneously to collide. Importantly - colliding channels doesn't undermine security because you have to have the whole passphrase to successfully perform PAKE. But, when you say it should probably be fixed, you refer to mitigating a DOS attack?
That's likely the case, it looks like I had an old version installed in Termux. But in any event that's just a matter of getting an intelligible error message, and not fixing the underlying problem.
> I think wormhole uses only 99 channels [1] so it is also susceptible to a DOS attack.
In that case I'd certainly have the same concern.
> But generally, collisions in channels can occur but are probably pretty rare because I don't have enough people using croc simultaneously to collide.
I disagree. Suppose you only average 2 users at a time. In this case ~1/630 connections will randomly collide, just by chance, which means that some people have certainly experienced this already. IMO that's too high, and in any case the relay code certainly shouldn't be written to only support <10 channels. (Note that the number of collisions will actually be higher than my back of the envelope math suggests, because some 3 character prefixes will be more common than others since you're using English words.)
Importantly - Wormhole doesn't have this problem, even though it technically has a lower channel limit, because it allocates channels dynamically instead of having the client pick them.
IMO it's also pretty weird to take the channel ID from the beginning of the passphrase. This takes away any entropy you'd otherwise get from the first word, since you're using a wordlist, which means the effective size of the secret used for PAKE is only about 1600^2. If you're okay with that, just use the entire word. It seems pretty low to me.
But yes I would like to see more work done to mitigate extremely easy DOS attacks, given that this seems like the most obvious vulnerability to these relay-managed PAKE approaches.
I'm confused about why you wrote "relay code shouldn't be written to only support <10 channels" as any three character combination is a channel?
You mention wormhole doesn't have a problem with colliding channels, but that it requires assigning a channel from the relay. To me this is a trade-off. If wormhole can't connect to a relay, wormhole can't assign a channel and won't work. Whereas, in croc, if you can't connect to the relay it will still work over LAN since the client chooses the channel.
I appreciate this discussion. There's a discussion on GitHub about this now. [1] Would you mind moving this discussion there?
You are wrong about a 99 channel limit. You can see that is not the case by passing the `--code` flag to wormhole and specifying a mailbox id larger than 100.
The receiver and sender find each other using the first three characters of the passphrase to establish a channel to communicate. Once in the channel they perform PAKE which will allow both parties to establish a secure key from the entire passphrase OR it will alert both parties that the passphrases differ and will break the channel (PAKE only works if the passphrase is the same, even if they made it to the same channel).
The sender and receiver know who the relay is beforehand. That is they either use the default relay (baked into the croc binary) or specify another relay.
Basically the relay server is a dumb switchboard. It waits for a connection to be established with an identifier. When two connections with the same identifier are found, the relay server pipes the data between them.
Security is ensured by having each connection to the relay use PAKE. Once the two clients are connected via the relay, they establish their own communication securely using PAKE, so neither the relay nor anyone else can decipher the transfer.
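The switchboard part can be sketched in a few lines of Python. This is an in-memory illustration only, not the relay's real code: the class and function names are hypothetical, clients are plain strings, and the PAKE step and the actual piping of bytes are deliberately left out.

```python
from dataclasses import dataclass, field


def channel_id(passphrase: str) -> str:
    """Channel identifier: the first three characters, as described above."""
    return passphrase[:3]


@dataclass
class Relay:
    """In-memory stand-in for a relay: it only pairs clients by identifier."""
    rooms: dict[str, list[str]] = field(default_factory=dict)

    def join(self, identifier: str, client: str) -> str:
        members = self.rooms.setdefault(identifier, [])
        if len(members) >= 2:
            return "room is full"
        members.append(client)
        return "paired" if len(members) == 2 else "waiting"


relay = Relay()
print(relay.join(channel_id("xyz blahblahblah"), "sender"))
print(relay.join(channel_id("xyz blahblahblah"), "receiver"))
print(relay.join(channel_id("xyz something-else"), "intruder"))
```

Note that the third client lands in the same room despite having a different passphrase, because only the prefix selects the channel; whether two paired clients actually share a passphrase is decided by PAKE afterwards, not by the relay.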
If you use croc locally, you actually don't use the public relay, as the two clients can automatically discover each other with peer discovery. [1]
> does the developer maintain and fund it out of pocket? That seems to me to be quite costly.
Yes, I've maintained and funded it out of pocket for a few years. Recently, I've gotten GitHub sponsor support that helps. [2] It's not too costly.
Here's an alternative to AirDrop! https://github.com/spieglt/FlyingCarpet Linux, Mac, and Windows, requires no network, just two laptops with wireless cards.
I use Syncthing for this kind of transfer involving non-Macs although that's because I'm always transferring between a set of known hosts, not ad-hocly. Works well enough.
I pine for the days of drop.io to return before it was acquired and dropped (BA-dum PSH!). That was, hands down, the best and easiest way to get files from one place to another.
I would like to see a security analysis. The code phrases seem ludicrously short to me …