I have failed time and again to grasp how exactly Dat works. IPFS is easy for me...

rakoo · on April 19, 2020

BitTorrent is immutable; Dat is the mutable version of BitTorrent.

You create an archive that is identified by a cryptographic pubkey. You add files to it, and dat stores some metadata in it. You give the archive's id to a friend, who starts retrieving the metadata and can then see there are files; he can download files as he pleases. Syncing can also be realtime, so he gets new content as soon as you add it.

Only you, the holder of the cryptographic privkey, can add content to the archive. Crypto is used to sign all content, so there's no doubt it was legitimately written by you. Since the id doesn't change, multiple peers can inter-connect and exchange data as needed in a swarm fashion, even if you're offline.
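To make the "only the privkey holder can write, anyone with the id can check" idea concrete, here is a minimal sketch. It is not the signature scheme Dat uses: Python's standard library has no public-key signatures, so this swaps in a Lamport one-time signature built from SHA-256. The names `keygen`, `sign`, and `verify` are made up for the illustration.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Return (privkey, pubkey). Illustrative Lamport keys, not Dat's key format."""
    # 256 pairs of random secrets; the public key is the hash of each secret.
    priv = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pub = [(_h(a), _h(b)) for a, b in priv]
    return priv, pub


def _bits(message: bytes):
    # The 256 bits of SHA-256(message), most significant bit first.
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(priv, message: bytes):
    # Reveal one secret per bit of the message hash.
    # A Lamport key must only ever sign ONE message; real schemes lift that limit.
    return [priv[i][bit] for i, bit in enumerate(_bits(message))]


def verify(pub, message: bytes, signature) -> bool:
    # Anyone holding the pubkey can check each revealed secret against its hash.
    if len(signature) != 256:
        return False
    return all(
        _h(secret) == pub[i][bit]
        for i, (secret, bit) in enumerate(zip(signature, _bits(message)))
    )


priv, pub = keygen()                      # pub plays the role of the archive id
signature = sign(priv, b"new-file.csv")   # only the privkey holder can do this
ok = verify(pub, b"new-file.csv", signature)
forged = verify(pub, b"evil-file.csv", signature)
```

The point is only the asymmetry: peers who were handed `pub` can check content they got from anyone in the swarm, without trusting the peer that served it.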

stavros · on April 19, 2020

That's a great explanation, thank you. How does this handle versioning? If I add one file to a vast dataset, can people download just that file if they already have the previous version? IPFS hashes at the chunk level, so even if you append something to a large file, you can download just the new chunk and be up to date.

rakoo · on April 19, 2020

Yes, in fact dat was initially created for this use case: data analysts want to exchange their data, and that data evolves over time, so it needs to be transported "efficiently", as in, you don't need to redownload a full .zip just for a single file change. The only metadata you'll receive will concern the new file, and the old content is still valid. You can even seek inside any file if you don't want to download the whole file.

Changes _inside_ a file, though, are not handled. Today, if a file is modified, dat will consider all the bytes of the old one to be garbage and will not reuse them.
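A toy model of the two paragraphs above, as a sketch only. `Archive`, `Entry`, and `bytes_to_sync` are hypothetical names, not hyperdrive's API, and the real metadata format is more involved. It assumes the archive is an append-only log of "path written" entries and that a peer remembers the last version it synced.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    version: int
    path: str
    size: int  # bytes of file content this entry points at


@dataclass
class Archive:
    log: list = field(default_factory=list)

    def put(self, path: str, size: int) -> Entry:
        # Every write appends a new entry; old entries are never rewritten.
        entry = Entry(len(self.log) + 1, path, size)
        self.log.append(entry)
        return entry

    def entries_since(self, version: int) -> list:
        # Only the metadata a peer at `version` has not seen yet.
        return self.log[version:]


def bytes_to_sync(archive: Archive, have_version: int) -> int:
    """Content bytes a peer at `have_version` must fetch to reach the latest version."""
    latest = {}
    for entry in archive.entries_since(have_version):
        latest[entry.path] = entry.size  # a later write to the same path wins
    return sum(latest.values())


archive = Archive()
archive.put("a.csv", 1000)
archive.put("b.csv", 2000)            # a peer syncs here, at version 2

archive.put("c.csv", 10)              # one small file added to the dataset
added_cost = bytes_to_sync(archive, 2)

archive.put("b.csv", 2001)            # one byte appended to an existing file
modified_cost = bytes_to_sync(archive, 3)
```

In this model adding `c.csv` costs the peer only that file's 10 bytes, while the one-byte change to `b.csv` costs all 2001 bytes, because nothing from the old copy is reused.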

dat is a sexy frontend on top of hyperdrive (https://github.com/mafintosh/hyperdrive). I personally think it's easier to see what dat can do by looking at what hyperdrive does.

stavros · on April 19, 2020

Thanks, that's very informative, and the intro in the Hyperdrive README clarifies the goals very well. I have a much better idea now, thanks again.

mattlondon · on April 19, 2020

I think the difference here is that the URL in dat is not a guarantee of immutability - it is just a public key and the author of the dat is free to update the content at any time without changing the key (there is a change history though).

I am not an expert on this, I just have a casual passing interest, so I might have it wrong.

This explains more: https://datprotocol.github.io/how-dat-works/

stavros · on April 19, 2020

Oh hmm, thanks. I've seen that link, but I'm not interested in getting into the byte-level, I just want a high-level overview. A URL being a public key that you can update if you have the private key makes sense, thank you. Do you know how propagation/distribution happens? With IPFS, whoever accesses your content serves it for some time.

mattlondon · on April 19, 2020

I believe the distribution is largely the same concept, i.e. you get a copy of the data from somewhere, then other nodes that are also looking for the same data might discover that you have it as well and so you might then serve it on to those other nodes too.
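A rough sketch of that idea, with peer discovery reduced to a dictionary lookup. `Swarm`, `announce`, and `fetch` are invented for the example and say nothing about how Dat's actual discovery or wire protocol works.

```python
class Swarm:
    """Toy tracker: remembers which peers hold a copy of which archive."""

    def __init__(self):
        self.holders = {}  # archive id -> set of peer names

    def announce(self, archive_id: str, peer: str) -> None:
        self.holders.setdefault(archive_id, set()).add(peer)

    def fetch(self, archive_id: str, peer: str, online: set):
        # Any online peer that already has the data can serve it.
        sources = sorted(
            p for p in self.holders.get(archive_id, ()) if p in online and p != peer
        )
        if not sources:
            return None  # nobody reachable has a copy right now
        # Having downloaded the data, the requester becomes a source too.
        self.announce(archive_id, peer)
        return sources[0]


swarm = Swarm()
swarm.announce("archive-id", "author")

# The author is online, so the friend gets the data from the author.
first_source = swarm.fetch("archive-id", "friend", online={"author", "friend"})

# The author goes offline; a newcomer can still get the data from the friend.
second_source = swarm.fetch("archive-id", "newcomer", online={"friend", "newcomer"})
```

Since the content is signed by the author's key, it does not matter to the newcomer that the bytes came from the friend rather than from the author.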