"Assume the attacker can tell how long it takes for mac == computedMac to run. If the first byte of an attacker-chosen mac is wrong for the attacker-chosen message, the loop terminates after just one comparison. With 256 attempts, the attacker can find the first byte of the expected MAC for the attacker-controlled message. Repeating this process, the attacker can forge an entire MAC."
How precisely should an attacker guess how long the comparison runs?
This is white-box security: a hypothetical setting where we assume the attacker has complete knowledge of the system and access to any oracle they want (such as an oracle telling them how long each function takes to run), but doesn't know any secrets, such as private or symmetric keys. If you can prove that your function is secure in that setting, then it's secure in real-world situations where the attacker knows even less.
This is a timing attack, or timing oracle. Let's assume a MAC represented as an array of 32 bytes. If we had a comparison method like this (written out as Java rather than pseudocode):

    static boolean insecureEquals(byte[] actualMac, byte[] expectedMac) {
        for (int x = 0; x < 32; x++) {
            if (actualMac[x] != expectedMac[x]) {
                return false;  // bail out at the first mismatching byte
            }
        }
        return true;
    }
We return false as soon as we hit a mismatching byte. If one iteration of the loop takes time Y and the attacker is able to time this method accurately, they can work out the value of actualMac by feeding known guesses as expectedMac: the method returns after roughly Y when the very first byte mismatches, 2Y when the first byte matches and the second mismatches, 3Y when the first two match, and so on.
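As a rough, purely illustrative harness (my own sketch, not from the comment above): time insecureEquals for guesses that share 0, 1, 2, ... correct leading bytes with the real MAC and average many runs, and the Y / 2Y / 3Y pattern shows up above the measurement noise.

    // Hypothetical local demo: average many timings of insecureEquals() for a
    // guess that shares `prefixLen` correct leading bytes (prefixLen < 32).
    static double averageNanos(byte[] realMac, int prefixLen, int runs) {
        byte[] guess = new byte[32];
        System.arraycopy(realMac, 0, guess, 0, prefixLen);      // correct prefix
        guess[prefixLen] = (byte) (realMac[prefixLen] ^ 0x01);  // guarantee a mismatch next
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            insecureEquals(realMac, guess);
            total += System.nanoTime() - start;
        }
        return (double) total / runs;
    }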
This is why we should compare the arrays in constant time: look at every byte of both arrays before returning. Because we never return early, the timing can't leak where the first mismatch was.
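A minimal sketch of what that looks like in Java, assuming both MACs have the same fixed length (in real code, prefer the MessageDigest.isEqual mentioned further down):

    // Constant-time comparison sketch: OR together the XOR of every byte pair,
    // so the running time does not depend on where (or whether) the arrays differ.
    static boolean constantTimeEquals(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;  // lengths are public here, so this early return is fine
        }
        int diff = 0;
        for (int i = 0; i < a.length; i++) {
            diff |= a[i] ^ b[i];
        }
        return diff == 0;
    }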
Why is it called constant time if it isn't constant with respect to array length? It just seems confusing because the algorithm is linear without a short circuit.
It's constant time in that it always takes the same amount of time regardless of the extent to which the two strings are equal. It is a different concept than constant time in complexity analysis.
What's even more confusing is that it is also constant time in the complexity-analysis sense, given that the MAC is usually a fixed-size string once the hashing algorithm is chosen.
Isn't it sufficient to compare 64 bits at a time? Then the oracle becomes rather useless.
Many current memcmp implementations do use wide comparisons like that, since it avoids hard-to-predict, data-dependent branches; the exact point of mismatch only has to be extracted once a whole word is found to differ.
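For illustration, a sketch of that 64-bits-at-a-time idea for a 32-byte MAC (a hypothetical helper, not something from the thread); the early exit now reveals only which 8-byte block mismatched, a much coarser oracle than per-byte timing:

    import java.nio.ByteBuffer;

    // Compare two 32-byte arrays eight bytes (one long) at a time.
    static boolean wordwiseEquals(byte[] a, byte[] b) {
        ByteBuffer bufA = ByteBuffer.wrap(a);
        ByteBuffer bufB = ByteBuffer.wrap(b);
        for (int offset = 0; offset < 32; offset += 8) {
            if (bufA.getLong(offset) != bufB.getLong(offset)) {
                return false;  // leaks only which 64-bit block differed
            }
        }
        return true;
    }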
Don’t really buy it. Seems to be both “spherical cow optimistic assumptions” and “anyone who could seriously think about pulling this off has nation-state level resources and already 0wnz you and/or already has the rubber hose at hand”.
Not really. It doesn't rely on that big of an assumption, nor does it require nation-state resources[0]. When you're trying to find the secret you can make a bunch of requests and look for a statistically significant difference, which can still be detectable beyond jitter & web server load.
That's also ignoring the fact that calling constant_strcompare(string, string) instead of strcompare(string, string) when working with secrets isn't that big of an ask.
If you could measure the time that granularly as a client requesting some resource from the server, how exactly would you know the time corresponds to the comparison and not to some tangential task?
They wouldn't be guessing, they'd be measuring. I'm not qualified to explain it in any depth, but if you want to learn more, "timing attack" is the term you're looking for.
(Note that Java’s MessageDigest.isEqual has been constant-time since shortly after that article, and you should use it rather than writing your own in Java.)
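For reference, using it is a one-liner (sketch only; computedMac and receivedMac are just assumed variable names):

    import java.security.MessageDigest;

    // Per the note above, MessageDigest.isEqual compares the two MACs in constant time.
    boolean macValid = MessageDigest.isEqual(computedMac, receivedMac);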
You guess the MAC tag for the message and measure how long it takes the server to return a "bad MAC" error (or otherwise behave as if the MAC was bad). In 1 out of 256 cases the check takes longer, because the first byte was correct. Timing is noisy, so you may need to send many queries and do some statistics to pick out that byte. Then you try all 256 possible values of the second byte; one of them will take longer because the first two bytes are both correct. Repeat.
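Sketched as code, purely for illustration: measureServerResponse is a hypothetical helper standing in for whatever request-and-timing harness the attacker uses, and averaging samples is the bare minimum of the statistics mentioned above.

    // Recover a 32-byte MAC one byte at a time via the timing oracle.
    // measureServerResponse(message, guess) is a hypothetical helper that sends
    // the forged (message, MAC) pair and returns the observed response time.
    static byte[] recoverMac(byte[] message, int samplesPerGuess) {
        byte[] guess = new byte[32];
        for (int pos = 0; pos < 32; pos++) {
            int bestByte = 0;
            double bestTime = -1;
            for (int candidate = 0; candidate < 256; candidate++) {
                guess[pos] = (byte) candidate;
                double total = 0;
                for (int s = 0; s < samplesPerGuess; s++) {
                    total += measureServerResponse(message, guess);
                }
                double average = total / samplesPerGuess;
                if (average > bestTime) {   // slowest average => longest matching prefix
                    bestTime = average;
                    bestByte = candidate;
                }
            }
            guess[pos] = (byte) bestByte;   // lock in the recovered byte, move on
        }
        return guess;
    }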
So far, all comments on this thread have been about the general concept of timing attacks...
You're asking a different question, though. You're asking about precision.
The answer here is that in many cases timing attacks pose a theoretical risk, but they can't be exploited in practice due to a low signal-to-noise ratio.
It really depends on the attack vector.
Measuring the latency of a network call (TCP) from the other side of the world, for example, is going to be too noisy (in many cases). Especially if the attacker wants to remain covert.
"Assume the attacker can tell how long it takes for mac == computedMac to run. If the first byte of an attacker-chosen mac is wrong for the attacker-chosen message, the loop terminates after just one comparison. With 256 attempts, the attacker can find the first byte of the expected MAC for the attacker-controlled message. Repeating this process, the attacker can forge an entire MAC."
How precisely should an attacker guess how long the comparison runs?