Encryption is less secure than we thought

timtadh · on Aug 14, 2013

If you really want to understand this you should read the paper. However, I will try my best to summarize my understanding of what is going on. (Disclaimer: I am not a professional cryptographer, but I know something about crypto-complexity).

Many proofs in cryptography start with the following assumption: assume x is a bitstring of length N drawn from a uniformly random distribution of bit strings. This assumption is then used as the basis of the proof of the cryptographic construct.

For instance in a provably secure stream cipher you start with that assumption then proceed to show that if weak one-way-functions exist then you can construct a secure pseudo random number generator that is polynomially indistinguishable from a true random source. You then use your assumed random bit string, x, as the key to the psuedo random number generator to generate a one time pad to XOR with your message.

Since the secure psuedo random number generator is indistinguishable from a true random number generator (by a polynomial attacker) the message is secure (from a polynomial attacker). However, this is all predicated on the key being drawn from a uniform distribution. If the key is NOT drawn from a uniform distribution than there may be a potential speed-up available to the attacker.

What the paper shows is that often it is assumed that words drawn from an IID distribution of letters can be treated as if the distribution of words is uniform (given a sufficiently large word size). However, this distribution while very flat is not quite uniform. They show how this can be exploited using the "Guesswork" framework to come up with a tighter bound on the time it would take to guess a chosen word. This bound is much better than the traditional bound used.

What this means is that there needs to be more margin given than previously thought when dealing with keys derived from non-uniform distributions. Such keys include passwords.

anologwintermut · on Aug 14, 2013

The paper has nothing to do with the key being chosen at random(edit: in general, in the linked biometric paper it might have been the case) in the way you are suggesting. Not just are key's assumed to be chosen uniformly at random, in practice they actually are (either via software PRNGs or hardware).

The paper is talking about the distribution of plain-texts. As you say, that the IID distribution of letters cannot be treated as if the distribution of words is uniform

This is not a common assumption in cryptography(typically because we assume the attacker knows the real distribution of the plain-texts, the idea being they learn no more about the distribution given the ciphertext) and the papers cited that make this assumption are from sub-par conferences. So it would be more accurate to say some people made a mistake that is neither that common nor a core tenant of cryptography.

danielweber · on Aug 14, 2013

One of the most fascinating things from my crypto classes was that you can prove an algorithm secure if the keyspace is drawn from all possible keys, including a null key that returns the same ciphertext as plaintext.

If you alter your keyspace to exclude that one null key, the proof no longer works.

I still feel like sitting down and staring at the wall when I think about it.

lmkg · on Aug 14, 2013

That reminds me of the Enigma machines in WWII: The thing was set up such that a given letter in plaintext could not end up as that same letter in cyphertext. This was supposedly to add security, but it turned out not only to be a weakness, but one that was critical in breaking the code.

That one makes sense, in a way: If you see a S in cyphertext, you know the plaintext did not have an S in that position. That eliminates 4% of all possible plaintexts, and by extension 4% of all private keys. That's kind of a metric ass-ton of entropy you just lost right there, for every single character.

My guess is, your proof works by a qualitatively similar principle: Restricting the keyspace, no matter how trivially, gives the attacker information. And encryption is all about magnifying information.

MattJ100 · on Aug 14, 2013

I think you meant "could not end up as that same letter"?

maaku · on Aug 14, 2013

Usually that's an indication that the specific mathematical property that you're proving is not what we mean when we say 'secure'.

NhanH · on Aug 15, 2013

Yes, but in a good way, generally it's far stronger than our common sense idea of "secure".

maaku · on Aug 15, 2013

I meant quite the opposite, that if what the GP said is true, than the specific theorem being proved does not share the properties implied by the common sense idea of "secure," and is therefore insufficient.

NhanH · on Aug 15, 2013

That's not neccessary true. Take for example, in the case where the proof shows that resulting ciphertext would be indistinguishable from a random function, so the adversary would have 1/2^n (with n = key length, 128/192/256 etc, ) chance to decipher the text by guessing the key uniformly random. And he also has another 1/2^n chance to break the system, by us picking the null key (randomly). That would result in the chance of 1/2^(n-1)

On the other hand, the proof shows that leaving the null key out of sample space means the adversary would have 1/2^48 (arbitrary number) to distinguish the ciphertext from a random function, which means that it's "insecure" in the context of the proof.

It seems to me that this notion of secure would still be stronger than our common sense idea of "secure".

B-Con · on Aug 14, 2013

FTA:

> Traditionally, information-theoretic analyses of secure schemes have assumed that the source files are perfectly uniform. In practice, they rarely are, but they’re close enough that it appeared that the standard mathematical analyses still held.

After reading your comment, I'm confused. Are they talking about the entropy of the key or of the plaintext?

tveita · on Aug 14, 2013

> What the paper shows is that often it is assumed that words drawn from an IID distribution of letters can be treated as if the distribution of words is uniform

Is this assumption really that common? It seems like a strange assumption to make, since even biased coin flips with P(Head) = 0.99 would be independent and identically distributed.

timtadh · on Aug 14, 2013

Not being any sort of expert I don't know how important their result is. They cite a couple of papers in which use AEP (Asymptotic Equipartition Property) to treat words drawn from IID distributions of letters as uniform. The paper I looked at was in turn about brute force attacks. So essentially this paper is arguing that analysis of brute force attacks must necessarily account for the non-uniform structure of the source distribution.

On the surface it seems like this only really effects things which rely on treating these types of distributions. It is difficult for me to say how this effects the analysis of primitives when using keys derived from user input. In practice I don't think it makes much difference as we are already getting quite good at exploiting the distributions from which people seem to draw passwords. However, it may have some important implications for the theoretical side of cryptography.

CatMtKing · on Aug 14, 2013

That doesn't sound identically distributed to me... am I missing something?

thesteamboat · on Aug 14, 2013

It's not that the P(heads)=P(tails), but rather the coins are identical to each other. Each throw of the coin has the same probability and is independent of other throws of the coin.

CatMtKing · on Aug 14, 2013

What the paper shows is that often it is assumed that words drawn from an IID distribution of letters can be treated as if the distribution of words is uniform (given a sufficiently large word size). However, this distribution while very flat is not quite uniform. They show how this can be exploited using the "Guesswork" framework to come up with a tighter bound on the time it would take to guess a chosen word. This bound is much better than the traditional bound used.

By it being very flat but not uniform, what do you mean? I'm not clear on this point.

timtadh · on Aug 14, 2013

I mean that the distribution is very close to uniform. A uniform distribution is drawn a flat line when you graph it. (Inconstrast a normal distribution is a bell curve.) So when I say it is very flat I mean that it there is very little variation in the the pdf.

chacham15 · on Aug 14, 2013

I really dont understand the article nor your explanation. What does it matter if the plaintext isnt random? The csrng will hide any un-randomness.

> However, this is all predicated on the key being drawn from a uniform distribution

So, we're talking about the generation of the key? Are you trying to say that /dev/random isnt secure (i.e. the source when using algos like SSL)?

viraptor · on Aug 14, 2013

What does "a polynomial attacker" mean? I think I get the idea, but the term is not very well known by google.

chrismonsanto · on Aug 14, 2013

Generally in theoretical cryptography the adversary is a probabilistic Turing machine that can take K steps to break your encryption, where K is asymptotically bounded by a polynomial function of the length of the input it is trying to break.

You will see this called a 'PPT' in the literature, where PPT = Probabilistic Polynomial Time.

danielweber · on Aug 14, 2013

Very generally, if your string is N bits, a polynomial attack is one that can solve it in O(n^i) bits, where i is some fixed number. A non-polynomial time attacker would be something beyond that, like O(2^n).

Technically this treats O(n^2) and O(n^10000) as the same effort.

mtdewcmu · on Aug 14, 2013

I think you can loosely interpret polynomial as "practical." Worse than polynomial typically means exponential, which makes the problem too big to be practical to solve. A polynomial-sized problem is considered practical to solve.

signed0 · on Aug 14, 2013

TLDR: Bloch doubts that the failure of the uniformity assumption means that cryptographic systems in wide use today are fundamentally insecure. “My guess is that it will show that some of them are slightly less secure than we had hoped, but usually in the process, we’ll also figure out a way of patching them.”

conformal · on Aug 14, 2013

the article title is a total troll, thx mit.

for a real world example of failed uniformity assumptions, see cryptocat.

moocowduckquack · on Aug 14, 2013

How is "Less secure than we thought" a troll title, given that the article is about how one of the fundamental assumptions of cryptography is less secure than we thought? It doesn't say insecure.

pessimizer · on Aug 14, 2013

Maybe he's looking for "Possibly Encryption Might be Somewhat Less Secure Than We Personally May or May Not Have Thought At One Time, Technically."

moocowduckquack · on Aug 14, 2013

Hmm, seems a bit less snappy, somehow.

tjoff · on Aug 14, 2013

They make it sound like patching will alleviate the problem but patching is not a solution for a lot of use cases (where it is unfeasible to redo the encryption or when the NSA already has the ciphertext).

(not that "slightly less secure than we had hoped" is enough reason to panic.)

api · on Aug 14, 2013

I wonder about the practical effect of this in the Real World(tm).

When cryptographers talk about a "break," they mean anything faster than brute force search. So a break of a 128-bit algorithm that allows key recovery in 2^115 is a "break," but still completely impractical.

But as Bruce Schnier says: "attacks only get better." So a break is a good reason to look for a fix or a better algorithm now before the crack widens.

driverdan · on Aug 14, 2013

This is my question too. How much less secure does it make encryption? Enough to be breakable or not enough to matter?

Anderkent · on Aug 14, 2013

They quote three papers that supposedly make the mistake of using Shannon entropy to reason about secrets. They all seem to relate to the same algorithm, so it sounds to me (and our local crypto guy) more likely that there's one particular field that is making this mistake, but it's not a common mistake in crypto.

In my understanding Shannon entropy is not very useful for crypto reasoning for more trivial reasons. A password that has a half chance of being "foo", and half chance of being completely random has good Shannon entropy, but it's clearly not a good password.

JonnieCache · on Aug 14, 2013

Paper: http://arxiv.org/abs/1301.6356

mjn · on Aug 14, 2013

Thanks, that actually looks interesting.

The MIT press office has a habit of writing press releases that go beyond the usual hype to be so over-the-top they actually make the work sound crackpot. Happens most often in AI, but I guess in crypto now too. Not the scientists' fault, so I usually try to read the original paper to see their own claims, and invariably they are much more specific and legitimate.

epistasis · on Aug 14, 2013

I think it's not just MIT's press office, but every single one that I've seen. Or maybe there's a selection effect; the more sensational the particular article, the more people see it, so it just seems that all press offices are off their rocker.

JonnieCache · on Aug 14, 2013

http://www.phdcomics.com/comics/archive/phd051809s.gif

andreif · on Aug 14, 2013

This would fit better as a comment for the original submission

mturmon · on Aug 14, 2013

Agreed. I have mostly stopped reading MIT Technology Review articles for this reason, and this article showed the same problem with hype and lack of context.

tveita · on Aug 14, 2013

> We establish that for source words originally constructed from an i.i.d. sequence of letters, as a function of word length it is exponentially easier to guess a word conditioned to be in the source’s typical set in comparison to the corresponding equipartition approximation.

Maybe I'm missing something, but this seems like a known and fairly trivial result.

I read that as basically saying that if you don't pick the letters in your passwords from an uniform distribution, they are easier to guess.

I don't see what that adds when people are already using Markov chains and dictionary attacks to brute force passwords, and I see no implication at all for secure keys chosen by good PRNGs.

I guess it could have some use in entropy estimation for PRNG state, but assuming uniform distribution of input seems like a naive mistake to make. Is there some application of this that I'm missing?

Tloewald · on Aug 14, 2013

If your input is "perfectly" random then the codebreaker has no chance of determining when he/she has successfully decrypted it (regardless of how weak your encryption is). So the more random your unencrypted data, the harder it is to decrypt it once it is encrypted. It follows that less random data will be easier to decrypt. Now I can't prove the shape of the intermediate function (e.g. might there be a sweet spot in the middle? "Common sense" suggests the relationship would be monotonic, but common sense is often wrong).

Incidentally, it's also clear that if the encryption schema is large relative to the compressed data then it will also be harder to decrypt (of course if the encryption schema is public, e.g. a public key, then this is decidedly not true).

So, none of this should be terribly surprising. What it does suggest is that you should compress data before (or while) encrypting it, and the better your compression algorithm, the more entropy there will be in your data, and the more secure your encryption.

nialo · on Aug 14, 2013

This is the obvious intuitive response, but it turns out that it's broken: Compressing data before encryption can give attackers a new side channel to recover the plaintext, by just watching the message length, and seeing when the compression is more or less effective. See, for example, the CRIME attack on TLS.

Tloewald · on Aug 15, 2013

Seems to me that the inherent problem here was that you could get the sender to send messages of your choosing, which is always going to be a vulnerability. Consider that this entire line of discussion is redundant if (a) the encryption key is public (as stated above) or (b) you know what the plaintext looks like for some encrypted examples.

p0ckets · on Aug 14, 2013

Obvious next step: compress and pad to original length.

rmidthun · on Aug 14, 2013

Obvious? Didn't that word already cause enough trouble?

What are you going to pad it with? Random values? In that case, I'll just make two requests for the same information and clip it where they differ. Some sort of value based on the message? That also has issues, see link below. So far it doesn't seem like there are any obvious crypto ideas that are not also wrong.

http://en.wikipedia.org/wiki/Padding_oracle_attack

nknighthb · on Aug 15, 2013

> I'll just make two requests for the same information and clip it where they differ.

So, byte 0? You're not really using ECB, are you?

rmidthun · on Aug 15, 2013

D'oh. Good point.

At least I can point to it as another example of someone not expert in the field making a stupid statement about crypto...

I think the general problem of padding with random values resulting in non-determinism would still matter in some cases, but IANAC.

CWuestefeld · on Aug 14, 2013

Ah, but if you compress it into, say, a ZIP file, then an attacker would be able to look for the characteristics that are typical of a ZIP file. The opening bytes, for example, are completely predictable. This would be just like the Nazi mistake of always ending messages in "Heil Hitler".

parfe · on Aug 14, 2013

>This would be just like the Nazi mistake of always ending messages in "Heil Hitler".

Got any source for that? Closest I can find is a Polish cryptologist facetiously adding "Heil Hitler!" to ENIGMA encrypted messages he sent to Britain.

https://en.wikipedia.org/wiki/Marian_Rejewski

Tloewald · on Aug 15, 2013

I believe the actual issue was that messages started with very predictable greetings.

"we relied on the fact that the greater number of messages began with the letters ANX—German for "to", followed by X as a spacer"

http://en.wikipedia.org/wiki/Cryptanalysis_of_the_Enigma

Compressed files (indeed ANY files) having predictable headers is, of course, a major problem, and the argument that you could just omit the common stuff won't work very well either. (Let's suppose you leave off all the static parts until you get to a length stamp -- lengths are a classic example of not-very-random numbers (more significant digits are prone to be lower-valued).

In general, you want your cryptographic system to be reasonably foolproof when used by bored, careless people. For this reason rolling compression and encryption into a single implementation is probably the best option.

eru · on Aug 14, 2013

Just cut out the common parts.

stcredzero · on Aug 14, 2013

From what I've seen, watching security and cryptography from the sidelines for the past 18 years, is that we programmers are still in the myopia of the early days of technology.

It took decades for programmers to come to the awareness of the importance of human factors and user interfaces to reach today's levels. (Really, how useful is computing, if only a very few want to use it?) Yet we're really just beginning to truly get it as a field. (Or as a "not-quite-a-field," to give a nod to Alan Kay.)

It's probably going to take just as long for us to figure out how to account for human factors and economics when it comes to security. Yet, if you are really savvy about security, it should leap out at you that human factors and economics are just about the biggest two determining factors in security.

The big problem is not what algorithm you use to encrypt A or B. It's the employee's and programmer's lack of knowledge about security and key management. It's the secretary who clicks on the phishing link. It's management's misconception that a dozen of BigCo's employees with insufficient peer review can take on the reverse engineering resources of the entire Internet.

The big problems in security have to do with organizational awareness and economics. If you're not accounting for those, you are misallocating your resources, probably by an order of magnitude.

Tichy · on Aug 14, 2013

The things the mention sound to me just like the standard things cryptographer worry about on a daily basis. I am not sure there is anything new there.

BetaCygni · on Aug 14, 2013

Isn't this just a variant of the known plaintext attack?

VLM · on Aug 14, 2013

Maybe a good analogy is if you take a "giant" rainbow table, you can usually compress the unencrypted side of the table to something quite a bit smaller than you'd think.

Maybe another way to phrase it is something like encrypting plain text english is now mathematically proven to be not as secure as encrypting a zip (or other format) compressed version of the same english text. (edited to add that its early in the morning and the specific example of a zip file is a great known plaintext attack 0x04034b50 and all that, but "compressing in general" is a good idea. The mistake I made was like calling all photocopiers, "Xerox machines", or all nose tissues "Kleenex").

Crypto is very well taught for algorithms, classes of algos, history, couple other things, but the classes of apps of crypto technology is (intentionally?) very poorly documented and taught. Something that screws up screws up digital sigs probably has little to do with the issues in encryption or authentication or whatever. Its an area ripe for someone to write a book, or a startup to do "something" probably something educational.

api · on Aug 14, 2013

But be careful when compressing. Never compress something secret together with something that an attacker can repeatedly influence, or you're possibly vulnerable to compressed length oracle attacks similar to CRIME.

rst · on Aug 14, 2013

BREACH is a variation on the theme (compression at a different protocol layer, but again pre-encryption). We're going to see more of these.

__alexs · on Aug 14, 2013

> "compressing in general" is a good idea

Unless there is any opportunity for the attacker to control part of the plaintext?

tehwalrus · on Aug 14, 2013

no, it means knowing the algorithm allows you to infer more information from the encrypted data, because the "real" entropy (as opposed to the apparently now defunct Shannon entropy) will be much smaller than you expect (you have to try less brute force guesses.)

Shorel · on Aug 14, 2013

This is just theory getting up to speed with current practice.

Password breakers (when having the hash result) do not assume that the passwords are random strings, they use several assumptions that greatly reduce the time they need to break the passwords.

tehwalrus · on Aug 14, 2013

This is an interesting and important result. I was fascinated by the Shannon entropy when I first heard of it in undergrad, and I'm now very tempted to go read all the other entropys people came up with too :)

Andome · on Aug 14, 2013

Well, the guessing probability is the logarithm of the min-entropy, not the Shannon entropy. As for Shannon entropy, it's impossible to get SE with our current code generation because there will always be a small difference from theoretical limit. When the length of our codes get bigger the Shannon entropy/uniformly distribution model breaks down and the Guesswork time isn't as long as we once thought.

MichaelMoser123 · on Aug 15, 2013

You can send a key for a pseudo random number generator in the start of the message, use the key to seed it, the then use the output of the pseudo random number generator and xor it with each byte of the message. that would be some kind of 'salt' that would make the message more redundant.

MichaelMoser123 · on Aug 17, 2013

I meant sending the seed key the initial part of the encrypted message; all further data would be retrieved after decryption by XORing the PRNG output.

Retric · on Aug 14, 2013

When they say the same number of zeros and ones at maximum enthropy that's not actually true. The obvious case being a single bit file or even just an odd number of bits. Instead there is no bias in the file so each bit could just as likely be 1 or 0 which prevents efective compression.

skriticos2 · on Aug 14, 2013

I'd say a single bit file is an exception. If you want to encrypt a single bit, I suggest you use a onetime pad and xor it. Can't get smaller, more secure or simple.

Retric · on Aug 14, 2013

Actually what you do for most algorithms is to add random padding which is then removided after decription.

sp332 · on Aug 14, 2013

You need an equal number of 0's and 1's, and you can use compressibility to see why. If there are more 1's than 0's, your compression tool could start by assuming that all the bits are 1's, and only storing information about where the 0's are. To minimize compressibility, you have to make sure there's no bias toward certain patterns, even patterns of length 1.

robrenaud · on Aug 14, 2013

You need every bit string to be maximally unpredictable a priori. EG, every bit string of length n is equally likely. This is not the same as having exactly equal numbers of 0s and 1s. That doesn't really even make sense, because with the constraint as mentioned, you can't ever transmit an odd length string. Also, if you are transmitting a bit string of length 2, knowing the first bit guarantees you know the second, which is low entropy (you only have one bit of entropy in a 2 bit message).

On the other hand, it is the case that most long bit strings sampled from a maximally entropic source have roughly the same number of 0s and 1s.

dlubarov · on Aug 14, 2013

Let's say the string contains 100 0s and 101 1s. How would you encode the positions of the 0s without using more than 201 bits in total?

Nearly all random strings of sufficient length have different numbers of 0s and 1s. So if your argument were true, your compression scheme would reduce the average length of completely random strings, which is impossible.

Retric · on Aug 14, 2013

Use 200 bit's to encode the message except the last bit which is determined if the message had 100 1's or 101 1's. QED knowing the number of 1's buy's you at least one bit of information.

Edit: You can do better than this by counting the number of 201 bit messages with 100 0's which is well below 2^201 and then encoding which of those corresponds to the original message. Aka if you had an 8 bit message with one zero 11111101 you have a total of 8 options which means you can encode it as 3 bits.

robrenaud · on Aug 14, 2013

Even knowing the exact number of 1s and 0s buys you a few bits of information.

log(201 choose 100) / log(2) = 196.843

Order the 201 choose 100 strings on 201 bits with exactly 100 0s lexicographically, write down the index in that list of your particular message in 197 bits in a completely ordinary fashion ;P.

But to be clear, I don't disagree with what you mean at all.

raverbashing · on Aug 14, 2013

A single bit file has either one 0 or one 1, or am I missing something?

may · on Aug 14, 2013

I'd be curious on cperciva's take on this.

plg · on Aug 14, 2013

I bet non-compliance with best practices accounts for 10x the amount of "less secure" than anything else. (Like in birth control)