ZeroBin, opensource Pastebin where the server has zero knowledge of pasted data (sebsauvage.net)
218 points by alixmartineau on April 12, 2012 | 71 comments



The genius of this is the realization that browsers do not send the named anchor (technically "fragment identifier"[1]) to the server. Using the named anchor as the cryptographic key enables users to pass around simple URLs to encrypted data. Data is stored on the server, but the server never has access to the complete URL with the key, so it cannot decrypt it.
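A minimal sketch of why the key stays on the client, using Python's standard `urllib.parse` to mimic how an HTTP client derives the request target. The host name, paste ID, and key below are made up for illustration; they are not taken from ZeroBin.

```python
from urllib.parse import urlsplit

# Hypothetical paste URL: the query carries the paste ID,
# the fragment (after "#") carries the decryption key.
url = "https://paste.example/?9f9ee11adc3a2093#c2VjcmV0LWtleQ"

parts = urlsplit(url)

# An HTTP client builds the request line from path and query only;
# the fragment is never part of it.
request_target = parts.path + ("?" + parts.query if parts.query else "")

print("sent to server: ", request_target)
print("kept in browser:", parts.fragment)
```

The server can log the request target and still never see the fragment, unless script on the page sends it back explicitly.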

As others have pointed out, this doesn't protect the data from a compromised server, but I think it has a different motivation.

It appears the purpose of this is to reduce the liability of whoever is running the server. Perfect for magnet links and such.

This is another step in the right direction of protecting the web and its maintainers from legislation.

[1] http://www.w3.org/TR/html401/intro/intro.html#fragment-uri


> It appears the purpose of this is to reduce the liability of whoever is running the server. Perfect for magnet links and such.

PasteBin itself is a DMCA notice magnet already for some very litigious people, as I know from having read through more public DMCA notices than most people would bother reading through. So if you're worried about legal liability, consult a lawyer to make sure that your technical solution would actually help you in court.

There's something called "willful blindness" that you might want to understand if you plan to run a PasteBin clone where people could be expected to post magnet links to pirate content. You could also do worse than to have a lawyer explain your DMCA obligations to you, too. If people start posting pirated stuff to your site, you're going to want to be very clear about them, lest you find out that you technically don't have DMCA safe harbor because you flubbed one of the requirements.


Sort of off-topic, but on the subject of using fragment identifiers to pass around secrets, Ben Adida used them to secure session cookies from eavesdroppers:

https://github.com/benadida/sessionlock

http://www2008.org/papers/pdf/p517-adida.pdf

He uses a token in the fragment identifier to authenticate every request; since the fragment identifier never gets sent to the server, a passive attacker never sees it -- it's a nice trick.


Interesting... but bafflingly, doesn't the URL shortener service they provide totally defeat this?

http://snipurl.com/230jiso

They allow you to shorten the URL by using another service. But now snipurl.com has your URL fragment and can read your stuff!


True, but this isn't where you are going to store your credit card information. My guess is that this is a defense against those who want to control the Internet through legislation.

Imagine a world where SOPA had passed, and everyone who ran a website was legally responsible for everything that their users did.

In that scenario, one way for website operators to protect themselves is to make it impossible to know what their users are doing.

The third-party URL-shortening service neither stores nor is associated with the encrypted data, and is also not responsible for it.

So this may not be about private data so much as it is about protecting the freedom of information.


You can't hack the law. All the legislators have to do is to make it mandatory for the site owners to be able to search through site contents.


It would be pretty easy to build a tool to route around that type of law. I create a service that encrypts blobs of data (encryptmyblo.bs), and uses a (publicly defined somewhere) postMessage API to communicate that blob to and from a key value store provided by a third party. I have my friend set up a key value store service provider that stores and retrieves blobs (storemyblo.bs), including the facility to search for stored blobs of data based on binary strings.

Because the services exchange those KVPs via a postMessage API, they neither communicate with one another directly nor have a formal association. The user is effectively (via the browser) moving the data from one service to another. Since EncryptMyBlo.bs doesn't store any data, and StoreMyBlo.bs doesn't have visibility into the data, neither service would be in violation of those requirements.


It would also be pretty easy to make encryption illegal.


No, because the goal isn't to protect the user's data from being "leaked", it's to protect the user from the hosting site (zerobin) being forced to take down their posts.


Quoth the project page: "Admins can still remove a document upon injunction or infringement notice… but have no way to tell if the same document has been posted again."


So what happens when Zerobin or SnipURL is ordered to take down "all posts from the same client IP address as the post with the shortened URL of ..."?


Don't log IPs? End-user IPs are mostly dynamic anyway.


And run the entire server out of memory. 64GB of ram is cheap on servers now; you boot from a write-protected flash drive, and everything is done in memory. If you power the box down, anything stored in ram is lost.


> If you power the box down, anything stored in ram is lost.

That's the theory of ideal RAM, but in practice RAM is not ideally volatile. Cf.: http://en.wikipedia.org/wiki/Cold_boot_attack


I believe that in almost all cases, the LEOs tasked with seizing equipment in an operation are going to be ill-equipped to execute this attack. Now, if it's the CIA or NSA after you, you have other problems.


SnipURL doesn't seem to make that claim. They even load https://www.paypal.com/en_US/i/scr/pixel.gif which has a tracking cookie.


It's not really true that (due to the data being encrypted client side) your data is safe even if someone were to gain control of the server. It's something often claimed by these "Host Proof" style services.

As long as you are downloading the client side code from the server, someone just needs to make a small change to the javascript and they get access to your data.

Only if you can trust the code, and then make certain that the code does not change, is your data truly safe from the host.

Cortesi (http://corte.si) has done some interesting writing on the subject. http://corte.si/posts/security/hostproof.html

He implemented a similar service (http://cryp.sr + http://corte.si/posts/security/crypsr.html ), and also worked on creating a browser addon that would verify a webpage against a known hash (https://github.com/cortesi/apphash).
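The "verify a webpage against a known hash" idea can be sketched roughly as below. This is only an illustration of hash pinning with Python's standard library, not how apphash itself works; the page contents and function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_hash(page_bytes: bytes, pinned_hex: str) -> bool:
    """True only if the served bytes hash to the value pinned earlier."""
    return hmac.compare_digest(sha256_hex(page_bytes), pinned_hex)


# Hypothetical page that was reviewed once and pinned by the user.
trusted = b"<script>decrypt(location.hash)</script>"
pinned = sha256_hex(trusted)

# The same page after a one-line change by a compromised server.
tampered = trusted.replace(b"decrypt(", b"leak(location.hash);decrypt(")

print(matches_pinned_hash(trusted, pinned))
print(matches_pinned_hash(tampered, pinned))
```

The check only helps if it runs somewhere the server cannot modify (an addon or a local copy), and if every script the page loads is covered by the pinned hash.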

Here are a couple of HN submissions on cryp.sr which have some discussions on the "Host Proof" concept:

https://news.ycombinator.com/item?id=1438552

https://news.ycombinator.com/item?id=1226277


It's not about user security. You can encrypt your data with gpg and send it via pastebin, zerobin, or mailinator to your accomplice and stay secure. ZeroBin is about lifting the liability off the service provider.

Think of the tired old "does linking constitute a copyright violation?" question. If someone posts a list of magnet links or torrent URLs to ZeroBin, the operators can claim plausible deniability: they cannot know the content of the pastes they host.

This reduces to deep questions about internet freedom. For example, if ZeroBin couldn't for some legal reason host "infringing" links, they couldn't host any links, and Tor, I2P, Freenet, and other onion/mix networks would be similarly illegal. Put another way: can it be illegal to participate in an anonymous network or not? Some parties would like the answer to be "yes". But we all know that there's copying bits and there's copying bits: at some point it doesn't matter, and at some point bits can't be legal or illegal but are just bits.

Technology like ZeroBin can help drive the liability question down to the very, very basics where it can be eventually solved.


They actually mention this in the "drawbacks" section of the project page:

>Won't protect against Man-in-the-middle attacks (eg. javascript substitution)

I think it is pretty much the case that anything that works as man-in-the-middle will also work if an endpoint is compromised.


If you do not trust the supplied server, you can host the javascript (including the support libraries) on a server you control (and, theoretically, can trust) by making your own web site. I would expect it to be pretty simple to store the data on your server or on a third party server, e.g. S3.


It's a pedantic note, but I agree they should clarify on their project page that in the event of server compromise only your past data is secure and only if no one accesses past data post-compromise. As usual, encrypt offline with your own unshared private key if you really want it to stay private. I agree with the comment at the top though that the primary utility is the legal predicament it presents.


Computers with dotjs or greasemonkey could also sneakily get your stuff, with all the compromising code invisible in network traffic and even when inspecting the page source. Something to consider if using an untrusted computer (and not just keyloggers and the like).


Would browser-hosted JS solve the problem?


Not if the HTML hosting any of the JS is under attacker control.


Can you solve this with NoScript applied only on server side code? Or is there some clever way to bypass that too?


Hm. I ran this with Fiddler running, and I saw exactly what they claimed. I received a ciphertext and the client rendered it.

More importantly, I noticed another query before it: GET /qsml.aspx?query=http%3A%2F%2Fsebsauvage.net%2Fpaste%2F%3F9f9ee11adc3a2093%2312WGK1zDE5Nqpz8mwVa%2BA%2BQQ8%2F12zJqHb5935uRvWdw

Bing was searching for my link on the internet in case it wasn't a URL. Knowing this, couldn't someone demand that Google hand over someone's search history for sebsauvage.net? If whole URIs are returned, the fragment (and so the key) will be provided too.

This doesn't seem completely private. The webserver may not know what's in your text, but your search provider will, or at least will have what it needs to find out.
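To make the leak concrete, here is a small sketch that decodes the query string quoted above with Python's standard `urllib.parse`. It only shows what a party holding that logged request could read out of it; the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlsplit

# The query string from the captured GET /qsml.aspx request above.
logged_query = (
    "query=http%3A%2F%2Fsebsauvage.net%2Fpaste%2F%3F9f9ee11adc3a2093"
    "%2312WGK1zDE5Nqpz8mwVa%2BA%2BQQ8%2F12zJqHb5935uRvWdw"
)

# Percent-decoding restores the full pasted URL, "#" and all.
pasted_url = parse_qs(logged_query)["query"][0]
parts = urlsplit(pasted_url)

print("full URL:", pasted_url)
print("paste ID:", parts.query)
print("key     :", parts.fragment)
```

Because the whole URL was percent-encoded into a query parameter, the "#" no longer acts as a fragment delimiter in the outgoing request, so the key travels to the search provider along with the paste ID.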


i made a chatroom implementation on exactly the same principle a few years ago using long polling. the password to the chatroom served as the client-side encryption/decryption key; the encryption algo was AES with a random IV per word.

here is the algo i was using for the client-side enc/dec.

http://www.myersdaily.org/joseph/javascript/alphac.html


Sorry dude, your home-brew encryption scheme (not AES) is trivially breakable.

All messages with the same key are XORed with the same stream of data; given more than one message encrypted with the same key, that is trivially reversible (see any cryptography textbook for details). Your "seed" doesn't help because you include it in the message...


heh, well it was a prototype. nothing in production.

the crypto scheme was randomized between 3 algos to reduce the statistical data size for each. one was AES, one was alphacrypt, another one was something else. the json protocol exchanged an algo_id which was stored with each message.

as for home-brew, i used the encrypt/decrypt code i found at the provided location. i'm no crypto expert and the guy claims to have a mathematics PhD with "Eleven years of publishing scientific and technical papers (computer science and higher-level mathematics)", so hopefully his security assertions are not entirely without merit.

the key is not stored anywhere. it must be exchanged by other secure means, just like any symmetric encryption/AES.

if you encrypt "a" using a key "foo" and the lib selects a random seed char to tack onto every word, there can (from what i understand) be 255 variations to encode the same plaintext word...and this set varies for each key.

storing the seed, like storing a salt doesn't seem like it would help much. so it's not immediately obvious that having access to many messages encrypted using the same key would allow you to do any kind of trivial statistical analysis other than on word length alone.

if it is in fact trivially breakable, i would love to see it implemented by simply having access to the encryption code and ciphertext of 5 different messages of several words in length encoded using the same key 100 times each.

it'd be awesome to learn more about crypto....and about 100 other topics, too :)


First of all the "seed" is bupkis because it's not actually used in the encryption - it's just used to xor the ciphertext, and then sent along with it. You can trivially reverse that step and then get on attacking the rest of it.

After that, the attack is elementary:

You have two plaintexts, A and B. These are both xored with the same keystream X, so the attacker has access to both A ^ X and B ^ X.

Xoring both cipherstreams gives you A ^ X ^ B ^ X, which is equivalent to A ^ B.
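The cancellation can be checked in a few lines. This is a sketch with made-up messages and a random stand-in keystream, not the code from the linked page:

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = os.urandom(16)      # X, reused for both messages
plain_a = b"attack at dawn!!"   # A (16 bytes)
plain_b = b"retreat at noon!"   # B (16 bytes)

cipher_a = xor_bytes(plain_a, keystream)  # A ^ X
cipher_b = xor_bytes(plain_b, keystream)  # B ^ X

# The keystream cancels: (A ^ X) ^ (B ^ X) == A ^ B.
combined = xor_bytes(cipher_a, cipher_b)
assert combined == xor_bytes(plain_a, plain_b)

# Anyone who knows or guesses A now gets B without ever seeing X.
recovered_b = xor_bytes(combined, plain_a)
print(recovered_b)
```

Even without a known plaintext, A ^ B leaks enough structure (spaces, common words) for standard statistical attacks on natural-language text.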

Do you see the problem yet, or do I need to continue?

EDIT: The real point is if you're not a crypto expert, don't write crypto code. (If you are a crypto expert, you already know better than to try writing your own crypto code). Also, don't trust obscure web pages that present crypto implementations - use a widely-known (and hence widely-vetted) implementation instead.


maybe i'll just stick with AES, then.

thanks for the info!


"The Code Book" by Simon Singh is awesome reading. It certainly explains how and why naive schemes like this are easy to crack.

http://amzn.com/0385495323


I was expecting this as soon as I saw that the CEO of Pastebin (or whoever he was, I don't remember) said that they will put a stop to posts from Anonymous and such.


The problem is always that we have to TRUST the server's claims about all this. I think the web should be enhanced with httpc

http://news.ycombinator.com/item?id=2024164


You could save a local copy of the html and javascript to make sure you're using the same code every time. You'd probably have to make a few tiny changes (absolute/relative URLs, etc), though.


Or you could just use PGP.


How would PGP solve the problem of knowing that a resource on the internet didn't change? Doesn't someone have to verify this, which is the point of httpc?


I don't see ZeroBin fulfilling any useful purpose for the following two reasons:

1) The encryption and decryption features do not provide data confidentiality to users because the encryption/decryption environment is not controlled by the end users. The host retains complete control over the algorithms used and the input and handling of keys and plaintext. Users would need to verify that the client-side environment is safe to use on every single page load.

2) The host will still be legally required to block access to encrypted messages once someone sends them a URL containing the key. Furthermore, the host will still be liable to produce log files identifying users. Extensive case law exists in multiple countries to support these two statements.

Similar comments were made in the HN discussion of CrypTweet[1].

[1] https://news.ycombinator.com/item?id=3611453


So it's basically https://ezcrypt.it/ with longer urls, a url shortener which breaks any measure of confidentiality, a less polished UI (which is kind of sad since it is basically a rip anyway), and broken SSL, which, for all its faults, will still help to protect the integrity of the javascript which is critical to the security. yaaaaawn


But ezcrypt.it's source doesn't seem to be released, which would be nice.



Hi, I've noticed that ezcrypt has started out on github but hasn't gotten past the first readme. Can anything be done to help it be fully published?


Interesting. This is similar to an idea I've had for a privacy-sensitive version of Facebook ("Faceless"?) that uses public key encryption at the client to prevent the servers from being able to read anything posted.

When you post new content to your friend list, your client uses your private key & your friends' public keys to post a separate encrypted copy of the content for each friend. Thus only they (and you) can read the content.

Note before people point out all the flaws in this idea =)... I'm aware that this would have problems on the key management side of things, plus one-encrypted-post-per-friend would get computationally (and bandwidth) intensive as your friend list (or "circle") increased in size. I'm also not convinced people actually want this - Facebook has made it abundantly clear that most people don't care about online privacy.

I would use it, though!


So, the same thing as https://ezcrypt.it/ ?


Not exactly unless I missed their project page where they released their source. I'd be very glad if it existed somewhere.


I came here to post this.


This is so ingenious that it baffles me how no one thought of it before.

Needs an alternative way of unlocking: a short link, plus a passphrase from which the key is generated.


Oh, but someone did think of it before. https://ezcrypt.it


Interesting project. If you're the developer - I would like to see the ability to view the encrypted message without viewing the page source.


I'm not sure that is true. If the link automatically decrypts the paste then you could use server logs to get the plaintext of the document.


From my understanding, the link contains 2 parts -- paste ID, and decryption key _following_ a "#". Assuming the latter isn't going into the server logs (I believe the fragment identifier isn't sent in headers at all, unless there's JS on the page to tell the server about it), the actual decryption seems to be taking place via javascript (as well as the encryption to begin with), and therefore the encryption key has no reason to be sent to the server at any point.


Someone didn't read the project page ...

See the section "When opening a ZeroBin URL: "


see: http://sebsauvage.net/wiki/doku.php?id=php:zerobin

the "pasting" and "opening" sections cover this


I'd like to see some improvements like making the ciphertext directly readable (and writeable) so that we can copy and paste the text into (and out of) true client-side decryption that doesn't rely on code served from the same server. This is one of those cases where it's highly secure in theory, but in practice may not be secure if the server is compromised.


This is interesting!

It really is true that if you have an idea there is already someone also working on it. I have been putting together something very similar and I'm just days away from launching it. I just set up the server last night. I will have to get it up ASAP now. Fortunately it's a little different from this but built on a very similar concept.


"kittens will die if you abuse this service" means "we have no reasonable method to stop you abusing this service".


They probably have the same methods as any other pastebin: block IPs which send an unreasonable quantity of data.


Projects like this scare me considering the issues PasteBin has in regards to sensitive data being shared illegally; and that was without encryption. I'm foreseeing legal hiccups and requirements for policing the content.


Real criminals can:

- Host such data off of a server bought with a stolen credit card and accessed via anonymous proxies.

- Use any of the normal pastebin sites out there with offline encryption methods. (E.g. use PGP + copy/paste encrypted text into browser)

- Host such data off of a hacked server.

I don't really see how ZeroBin is scary. You could even do something like this: https://dgl.cx/wikipedia-dns with a hacked domain account + a hacked server. I'm sure the amount of traffic could be small enough for the hack to remain hidden for a long time.


Interesting. I'll still stick with KDE paste for their api, but seems promising


How is the service able to verify the key? In the screenshots it says "Could not decrypt data (Wrong key ?)"

Shouldn't a wrong key decrypt the text to just unreadable garbage?
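One common answer is that authenticated encryption stores a tag alongside the ciphertext, so a wrong key fails the tag check in the client instead of silently producing garbage. The thread does not say which scheme ZeroBin uses, so the sketch below is an assumption-laden illustration of the tag check alone, using HMAC from Python's standard library and treating the ciphertext as opaque bytes. It is not a complete cipher, and a real design would use a proper authenticated mode or separate encryption and MAC keys.

```python
import hashlib
import hmac
import os


def make_tag(key: bytes, ciphertext: bytes) -> bytes:
    """Compute an authentication tag over the ciphertext."""
    return hmac.new(key, ciphertext, hashlib.sha256).digest()


def key_is_plausible(key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """True only if this key reproduces the stored tag."""
    return hmac.compare_digest(make_tag(key, ciphertext), tag)


right_key = os.urandom(32)
wrong_key = os.urandom(32)
ciphertext = os.urandom(48)            # stand-in for real encrypted data
tag = make_tag(right_key, ciphertext)  # stored next to the ciphertext

print(key_is_plausible(right_key, ciphertext, tag))
print(key_is_plausible(wrong_key, ciphertext, tag))
```

Under this model the check runs entirely in the browser: the server stores ciphertext and tag but never learns whether any particular key was right.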


Very clever.....now do it for files :D


would there be any use for a "n of m" version? you'd somehow generate m urls, and any n of them would allow you to recover the secret. i guess instead of urls they would be separate parameters, but otherwise it should be doable, right? can't think of any uses, though...


This would be easy to implement via a secret sharing scheme, where shares of the secret are maybe stored in local storage until enough of them are collected in the browser to recover the encryption key. But I can't really think of a compelling use case either.
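For reference, a minimal sketch of one such scheme (Shamir's secret sharing over a prime field), following the "any n of m" naming used above. The function names and the choice of prime are illustrative assumptions, and this is not hardened for production use.

```python
import secrets
from typing import List, Tuple

PRIME = 2**521 - 1  # a Mersenne prime, comfortably larger than a 256-bit key

Share = Tuple[int, int]


def split_secret(secret: int, n: int, m: int) -> List[Share]:
    """Split secret into m shares such that any n of them recover it."""
    if not (1 <= n <= m) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= n <= m and 0 <= secret < PRIME")
    # Random polynomial of degree n-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(n - 1)]

    def poly(x: int) -> int:
        result = 0
        for c in reversed(coeffs):  # Horner's rule
            result = (result * x + c) % PRIME
        return result

    return [(x, poly(x)) for x in range(1, m + 1)]


def recover_secret(shares: List[Share]) -> int:
    """Lagrange-interpolate the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbits(256)           # stand-in for the paste's encryption key
shares = split_secret(key, n=3, m=5)  # five shares, any three suffice
print(recover_secret(shares[:3]) == key)
```

Each share could be encoded into its own URL fragment; fewer than n shares reveal nothing about the key beyond its size. Note that `pow(den, -1, PRIME)` requires Python 3.8 or later.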


I'm not sure how this is any different from a pastebin service over https. The data is encrypted in the browser, meaning that someone won't be able to snoop (if that's a concern for some reason), but there is nothing stopping the server admins from seeing your data as long as the server is storing the encrypted data and the decryption keys. Am I missing something?


you are missing something, read their project page which states:

"The key is never transmitted to the server, which therefore cannot decrypt data."

but it does seem like the resulting queries could be stored in the server's logs, as the key needs to be part of the request? edit: no it doesn't, I needed to read more :) the key is the anchor part of the URL. neat!


The server doesn't store the decryption keys. However, the server does serve the crypto code which makes any perceived security boundary between the server and client bogus.

Cool trick though.


Awesome. I needed a nice pastebin for our intranet :)


Source code not available yet apparently.


Imma gonna paste my data where nobody can see. Imma gonna paste my data where it can be truly free. Imma gonna paste my data in grandma's apple tree.

Imma gonna clear all my caches and de-bank my cash. Imma gonna get a plane ticket to the Isle of San Mosh. Imma gonna get away from The Man, away from all this tosh.

Imma gonna paste my data where nobody can see. Imma gonna paste my data where it can be truly free. Imma gonna paste my data in grandma's apple tree.


Phrases like "Kittens will die if you abuse this service" make me uncomfortable. Just replace it with "Privacy & Terms coming soon".

I continue to read unprofessional things like that in the startup environment. I'd say leave the humor to your customers. Just my opinion. People always talk about "scaling". A phrase like that does not.



