
If you really want to understand this you should read the paper. However, I will try my best to summarize my understanding of what is going on. (Disclaimer: I am not a professional cryptographer, but I know something about crypto-complexity).

Many proofs in cryptography start with the following assumption: assume x is a bitstring of length N drawn from a uniformly random distribution of bit strings. This assumption is then used as the basis of the proof of the cryptographic construct.

For instance, in a provably secure stream cipher you start with that assumption, then proceed to show that if weak one-way functions exist then you can construct a secure pseudorandom number generator that is polynomially indistinguishable from a true random source. You then use your assumed random bit string, x, as the key to the pseudorandom number generator to generate a one-time pad to XOR with your message.
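To make that concrete, here is a toy sketch (my own code, not a vetted construction): SHA-256 in counter mode stands in for the provably secure pseudorandom generator, and the key comes from os.urandom as a stand-in for the uniformly random x.

    # Toy sketch of "PRG keystream XOR", not a vetted construction: SHA-256 in
    # counter mode stands in for the provably secure pseudorandom generator.
    import hashlib
    import os

    def keystream(key: bytes, nbytes: int) -> bytes:
        out = b""
        counter = 0
        while len(out) < nbytes:
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:nbytes]

    def xor_with_keystream(key: bytes, data: bytes) -> bytes:
        return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

    key = os.urandom(32)                # the "x drawn uniformly at random"
    ct = xor_with_keystream(key, b"attack at dawn")
    assert xor_with_keystream(key, ct) == b"attack at dawn"  # XOR twice = identity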

Since the secure pseudorandom number generator is indistinguishable from a true random number generator (by a polynomial attacker), the message is secure (from a polynomial attacker). However, this is all predicated on the key being drawn from a uniform distribution. If the key is NOT drawn from a uniform distribution then there may be a potential speed-up available to the attacker.

What the paper shows is that often it is assumed that words drawn from an IID distribution of letters can be treated as if the distribution of words is uniform (given a sufficiently large word size). However, this distribution, while very flat, is not quite uniform. They show how this can be exploited using the "Guesswork" framework to come up with a tighter bound on the time it would take to guess a chosen word. This bound is much better than the traditional bound used.
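Here is a toy illustration of that point (my own numbers, not the paper's, with the bias exaggerated so the effect shows up at small sizes): with a biased two-letter alphabet, the distribution over 8-letter words is spread out but not uniform, and guessing words in order of decreasing probability already beats the uniform-case expectation.

    # Toy illustration (my own numbers, not from the paper): words built from
    # IID letters cluster around probability 2^(-n*H) (AEP), but the
    # distribution over words is not uniform, and guessing in order of
    # decreasing probability beats the uniform-distribution expectation.
    from itertools import product
    from math import log2

    p_a = 0.6                      # biased two-letter alphabet {a, b}
    n = 8                          # word length

    def word_prob(w):
        k = w.count("a")
        return (p_a ** k) * ((1 - p_a) ** (n - k))

    words = ["".join(w) for w in product("ab", repeat=n)]
    probs = sorted((word_prob(w) for w in words), reverse=True)

    uniform = 1 / len(words)
    print("uniform prob per word:", uniform)
    print("most likely word:     ", probs[0])     # noticeably above uniform
    print("least likely word:    ", probs[-1])    # noticeably below uniform

    # Expected number of guesses when guessing in order of decreasing probability,
    # versus the (len+1)/2 expectation if the words really were uniform:
    avg_guesses = sum((i + 1) * p for i, p in enumerate(probs))
    print("avg guesses (best order):", avg_guesses, "vs", (len(words) + 1) / 2)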

What this means is that there needs to be more margin given than previously thought when dealing with keys derived from non-uniform distributions. Such keys include passwords.




The paper has nothing to do with the key being chosen at random (edit: in general; in the linked biometric paper it might have been the case) in the way you are suggesting. Not only are keys assumed to be chosen uniformly at random, in practice they actually are (either via software PRNGs or hardware).

The paper is talking about the distribution of plaintexts. As you say, the IID distribution of letters cannot be treated as if the distribution of words is uniform.

This is not a common assumption in cryptography (typically because we assume the attacker knows the real distribution of the plaintexts, the idea being that they learn no more about the distribution given the ciphertext), and the papers cited that make this assumption are from sub-par conferences. So it would be more accurate to say some people made a mistake that is neither that common nor a core tenet of cryptography.


One of the most fascinating things from my crypto classes was that you can prove an algorithm secure if the keyspace is drawn from all possible keys, including a null key that returns the same ciphertext as plaintext.

If you alter your keyspace to exclude that one null key, the proof no longer works.

I still feel like sitting down and staring at the wall when I think about it.


That reminds me of the Enigma machines in WWII: The thing was set up such that a given letter in plaintext could not end up as that same letter in cyphertext. This was supposedly to add security, but it turned out not only to be a weakness, but one that was critical in breaking the code.

That one makes sense, in a way: If you see an S in cyphertext, you know the plaintext did not have an S in that position. That eliminates 4% of all possible plaintexts, and by extension 4% of all private keys. That's kind of a metric ass-ton of entropy you just lost right there, for every single character.
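Back-of-the-envelope arithmetic for that (my own numbers, just quantifying the 4% figure):

    # Seeing a ciphertext letter rules out 1 of 26 plaintext letters at that
    # position; this quantifies how much information that leaks.
    from math import log2

    eliminated = 1 / 26
    print(f"plaintexts eliminated per position: {eliminated:.1%}")    # ~3.8%

    bits_per_char = log2(26) - log2(25)   # information leaked per character
    print(f"bits leaked per character: {bits_per_char:.3f}")          # ~0.057

    msg_len = 200
    print(f"bits leaked over a {msg_len}-char message: "
          f"{msg_len * bits_per_char:.1f}")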

My guess is, your proof works by a qualitatively similar principle: Restricting the keyspace, no matter how trivially, gives the attacker information. And encryption is all about magnifying information.


I think you meant "could not end up as that same letter"?


Usually that's an indication that the specific mathematical property that you're proving is not what we mean when we say 'secure'.


Yes, but in a good way: generally it's far stronger than our common-sense idea of "secure".


I meant quite the opposite: that if what the GP said is true, then the specific theorem being proved does not share the properties implied by the common-sense idea of "secure," and is therefore insufficient.


That's not necessarily true. Take, for example, the case where the proof shows that the resulting ciphertext would be indistinguishable from a random function, so the adversary would have a 1/2^n chance (with n = key length: 128/192/256, etc.) to decipher the text by guessing the key uniformly at random. He also has another 1/2^n chance to break the system, by us picking the null key (randomly). That would result in a total chance of 1/2^(n-1).

On the other hand, the proof shows that leaving the null key out of the sample space means the adversary would have a 1/2^48 (arbitrary number) chance to distinguish the ciphertext from a random function, which means that it's "insecure" in the context of the proof.

It seems to me that this notion of secure would still be stronger than our common sense idea of "secure".
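Plugging concrete numbers into the argument above (n = 128 is my pick; the 2^-48 figure is just the arbitrary example I used):

    # Including the null key only doubles the adversary's guessing advantage,
    # which is still astronomically small next to a 2^-48 distinguishing bound.
    n = 128
    p_guess_key = 2 ** -n           # adversary guesses the key
    p_null_key  = 2 ** -n           # we happened to pick the null key
    p_total     = p_guess_key + p_null_key
    print(p_total == 2 ** -(n - 1))   # True: still negligible

    p_distinguish = 2 ** -48          # the "insecure" bound from the proof
    print(p_distinguish / p_total)    # how much larger that bound is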


FTA:

> Traditionally, information-theoretic analyses of secure schemes have assumed that the source files are perfectly uniform. In practice, they rarely are, but they’re close enough that it appeared that the standard mathematical analyses still held.

After reading your comment, I'm confused. Are they talking about the entropy of the key or of the plaintext?


> What the paper shows is that often it is assumed that words drawn from an IID distribution of letters can be treated as if the distribution of words is uniform

Is this assumption really that common? It seems like a strange assumption to make, since even biased coin flips with P(Head) = 0.99 would be independent and identically distributed.


Not being any sort of expert, I don't know how important their result is. They cite a couple of papers which use the AEP (Asymptotic Equipartition Property) to treat words drawn from IID distributions of letters as uniform. The paper I looked at was in turn about brute-force attacks. So essentially this paper is arguing that analysis of brute-force attacks must necessarily account for the non-uniform structure of the source distribution.

On the surface it seems like this only really affects things which rely on treating these types of distributions as uniform. It is difficult for me to say how this affects the analysis of primitives when using keys derived from user input. In practice I don't think it makes much difference, as we are already getting quite good at exploiting the distributions from which people seem to draw passwords. However, it may have some important implications for the theoretical side of cryptography.


That doesn't sound identically distributed to me... am I missing something?


It's not that P(heads) = P(tails), but rather that the coins are identical to each other. Each throw of the coin has the same probability distribution and is independent of the other throws of the coin.
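A tiny sketch of what "identically distributed but biased" looks like, using the grandparent's P(Head) = 0.99 example:

    # Every flip has the same distribution and the flips are independent (IID),
    # yet the all-heads sequence is overwhelmingly more likely than it would be
    # under a uniform distribution over sequences.
    import random

    p_head = 0.99
    flips = [random.random() < p_head for _ in range(20)]   # IID Bernoulli(0.99)
    print(flips)

    p_all_heads = p_head ** 20
    p_uniform   = 0.5 ** 20
    print(p_all_heads, "vs uniform", p_uniform)   # ~0.82 vs ~9.5e-7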


> What the paper shows is that often it is assumed that words drawn from an IID distribution of letters can be treated as if the distribution of words is uniform (given a sufficiently large word size). However, this distribution, while very flat, is not quite uniform. They show how this can be exploited using the "Guesswork" framework to come up with a tighter bound on the time it would take to guess a chosen word. This bound is much better than the traditional bound used.

By it being very flat but not uniform, what do you mean? I'm not clear on this point.


I mean that the distribution is very close to uniform. A uniform distribution is drawn as a flat line when you graph it. (In contrast, a normal distribution is a bell curve.) So when I say it is very flat, I mean that there is very little variation in the pdf.


I really don't understand the article nor your explanation. What does it matter if the plaintext isn't random? The CSPRNG will hide any non-randomness.

> However, this is all predicated on the key being drawn from a uniform distribution

So, we're talking about the generation of the key? Are you trying to say that /dev/random isn't secure (i.e. the source when using algos like SSL)?


What does "a polynomial attacker" mean? I think I get the idea, but the term is not very well known by google.


Generally in theoretical cryptography the adversary is a probabilistic Turing machine that can take K steps to break your encryption, where K is asymptotically bounded by a polynomial function of the length of the input it is trying to break.

You will see this called a 'PPT' in the literature, where PPT = Probabilistic Polynomial Time.


Very generally, if your string is n bits, a polynomial attack is one that can break it in O(n^i) steps, where i is some fixed number. A non-polynomial-time attack would be something beyond that, like O(2^n).

Technically this treats O(n^2) and O(n^10000) as the same effort.
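A rough sense of the gap (my own numbers, treating each step as equal cost):

    # Why polynomial effort is treated as "feasible" while exponential is not:
    # compare a cubic-time attack with brute force over an n-bit key.
    for n in (32, 64, 128, 256):
        poly = n ** 3          # e.g. a cubic-time attack
        expo = 2 ** n          # brute force over an n-bit key
        print(f"n={n:4d}  n^3={poly:>12,}  2^n={expo:.3e}")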


I think you can loosely interpret polynomial as "practical." Worse than polynomial typically means exponential, which makes the problem too big to be practical to solve. A polynomial-sized problem is considered practical to solve.



