The Python code mentioned is still susceptible to timing attacks. Not on the same scale as a bytewise, return-on-first-inequality algorithm, but it's there. The return statement will take longer when the result value is 0, because only then does it go on to compare the lengths of the two strings. This could be enough to distinguish between correct and incorrect compares.
Since the usage of zip in that case will only compare the shorter string against the start of the longer one, you can still achieve the timing attack: the only inconsistency happens on the final comparison.
A better solution would be to check for length equality before doing the zip, or to pad the shorter of the two strings to the length of the longer. Maybe also to just make the return statement:

    return (len(a) == len(b)) and (result == 0)

which may be sufficient.
Ah, that's a very good point. I've updated the post to return early, since figuring out the short-circuiting behavior across a handful of languages is obviously not my strong suit. Thanks for pointing that out, sophacles.
y/w. This actually brings up a deeper problem too: the vast majority of timing attacks exist because of little algorithmic optimizations, such as returning on the first inequality. In most cases such a comparison is desirable, as it is on average 50% faster (depending on specific data characteristics). We as computer people do these sorts of things as second nature, because we have been taught that efficient algorithms are always best. In cases like these, even when I know that there is a major problem being fixed, I cringe at the inefficiencies. It also seems to be a cause of other types of security holes, e.g. "why should I check this data, it should have already been checked; if I do it here, it's a bad redundancy".
Anyway, rambling thoughts aside: thanks for the article.
I believe that Ruby's Digest == method suffers from a similar issue. It uses rb_str_cmp to compare the hex strings, which uses rb_memcmp to do the comparison, which in turn uses memcmp.
The implementation of memcmp bails out as soon as it finds a difference.
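Not the actual C source, but the same early-exit shape sketched in Python for illustration (the function name is mine):

    def memcmp_like(a, b):
        # Compare byte by byte and return on the first difference, so the
        # running time reveals how many leading bytes matched.
        for x, y in zip(a, b):
            if x != y:
                return ord(x) - ord(y)  # early exit: the timing leak
        return len(a) - len(b)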
I get that timing attacks are reproducible in test environments and in a theoretical sense, but have any major real-life attacks been based on a timing vulnerability? Most websites anyone would want to attack will be running off of multiple servers (which will be reading from replicated mysqls, parts of the requests will likely be memcached, the load balancers will likely have fluctuating load etc. etc.), even multiple data centers. The number of requests you'd have to observe before you could have any statistically meaningful information would be huge. And we're talking about sub-millisecond differences (even Google routinely serves pages in the 50ms range, so this difference is tiny).
Seems like other, more traditional problems (weak/poor crypto, social engineering) are much more important.
I kind of wonder if people just really like timing attacks because they're so easy to understand, and, once you spot them, so easy to fix.
Crosby et al. conclude:
> We have shown that, even though the Internet induces significant timing jitter, we can reliably distinguish remote timing differences as low as 20µs. A LAN environment has lower timing jitter, allowing us to reliably distinguish remote timing differences as small as 100ns (possibly even smaller). These precise timing differences can be distinguished with only hundreds or possibly thousands of measurements.
If you'd like to bet that this sort of statistical analysis will get harder to perform over time, I'll take your wager. I'm not willing to wait until someone is publicly compromised before I fix security vulnerabilities—I do not work for an employer who's willing to tolerate a few break-ins before fixing the locks.
I'd also argue that a) a timing-attack vulnerability is "weak/poor crypto," b) I don't need to write about the importance of other attack vectors (XSS, CSRF, social engineering, telepathy, bats, etc.) while writing about timing attacks, and c) if timing attacks were easy to understand and fix, there would be fewer of them.
This is a mathematical problem, like poker. On one hand, exposing the issue opens it up for attack by hackers who didn't know about it before. But it also makes systems in the future more secure. What is the probability that a malicious hacker found this issue before you and is using it to attack Java-based systems? What is the probability that, after you expose the issue, all the vulnerable systems will be patched with your suggestion?
I'm curious what other hackers think of this. I've found XSS vulnerabilities and other security issues in sites. What do you do when you find these? Do you notify the sites? Sometimes I notify the website of the issue, sometimes I don't. I struggle with the decision myself, but I don't think I'd ever go public with such a vulnerability without first making a determined effort to get a response from those who can fix the issue before the hackers find it.
For example, it looks like org.apache.ws.security.processor.UsernameTokenProcessor.handleUsernameToken might be vulnerable (although WS-Security's nonces and timestamps make things more complicated, since you have to generate a timestamp like a week in the future, and then hope that the server does the digest comparison before rejecting you based on the nonce or timestamp).
Not really. If this attack is doable over the internet, where latencies are randomly[1] distributed (in a normal curve), rand does not actually do anything; at most it means you need a few more sample points to get that noise out.
What if you did something like the following? (Using the example from the article: "A value which shares no bytes in common with the secret digest will return immediately; a value which shares the first 15 bytes will return 15 compares later.")
You keep track of the time while doing the comparison, then before spitting back the result you feed the elapsed time to a separate function that sleeps (or whatever) for the remainder of some predetermined interval, so that all results take the same amount of time to process.
I realize this is a bit excessive, and will slow down your code quite a bit, but if you are really that worried about this type of attack, why not?
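A rough sketch of what I mean in Python (the pad value is just an assumption about the worst case, and the equality check stands in for the real digest comparison):

    import time

    PAD_SECONDS = 0.01  # assumed worst-case comparison time

    def compare_padded(secret, candidate):
        # Do the real comparison, then sleep out the rest of a fixed
        # budget so every call takes (at least) the same wall-clock time.
        start = time.perf_counter()
        result = (secret == candidate)  # stand-in for the real comparison
        remaining = PAD_SECONDS - (time.perf_counter() - start)
        if remaining > 0:
            time.sleep(remaining)
        return result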
You don't really need to. You can guarantee that the right HMAC and the wrong HMAC both take exactly the same amount of time to compare by using a constant-time algorithm like the one in the article without dealing with non-portable, unreliable timers.
(In some languages—hi Ruby!—you might not even have access to reliable millisecond-resolution timing.)
Sleeping for a random amount of time will make it more difficult to mount a timing attack. But not impossible in theory. You just have to do more statistics --- significance tests and the like.
As naz pointed out, you can use sleep(1.0 - time_spent) to accomplish this.
The downside is that while the solution given in the OP takes the same amount of time for every input, that amount is not pinned to a fixed wall-clock value. When over the years processors get faster, the code can execute faster, but in the above case it cannot.
> When over the years processors get faster, the code can execute faster, but in the above case it cannot.
Sure it can. You can drop the limit whenever you change processors. In fact, you can tune the limit for a given call to be the max time taken by that call. As the implementation or resources change, the limit changes.
You're better off just using a constant-time algorithm which uses the same number of operations for all inputs. Few systems, if any, have sub-nanosecond-resolution timers, let alone thread schedulers which have quanta of that size.
> You're better off just using a constant-time algorithm which uses the same number of operations for all inputs.
That's easier said than done. Also, you have to come up with one for every security-relevant operation.
And "algorithm" isn't the issue; implementation is. Given caching and the like, that's really hard to do.
An easy way to implement the "sleep to worst case" idea is to simply time each call and keep track of the max. If a given call instance returns in less time, sleep the difference. (It's probably good to add a decay mechanism.)
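Something like this sketch, say (the class name and decay factor are made up; decay slowly forgets old maxima so one freak outlier doesn't pin the budget forever):

    import time

    class WorstCasePadder:
        # Hypothetical illustration of the "sleep to worst case" idea:
        # remember the longest duration seen so far and sleep each
        # faster call out to that maximum.
        def __init__(self, decay=0.999):
            self.max_seen = 0.0
            self.decay = decay

        def call(self, fn, *args):
            start = time.perf_counter()
            result = fn(*args)
            elapsed = time.perf_counter() - start
            self.max_seen = max(elapsed, self.max_seen * self.decay)
            if elapsed < self.max_seen:
                time.sleep(self.max_seen - elapsed)
            return result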
Although nanosleep() takes a timespec struct that lets you specify the interval with nanosecond precision, nanosleep doesn't actually deliver anywhere near that accuracy. Under Linux, the kernel only checks timers once every "jiffy," which defaults to 1 ms under the 2.6 kernel (on older kernels it was 10 ms, and 10 ms is still a compile option in the 2.6 kernel). Once upon a time, nanosleep would busy-wait if the sleep time given was less than 2 ms, but not anymore. On the other hand, don't be too discouraged: as long as you don't need submillisecond accuracy, nanosleep() is very good…
That means that any sleep will disguise a call's duration up to jiffy granularity, which is a good thing. (If the call's variance is small enough, it may mean that there's no need to figure out how long to sleep at all.)
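You can check the effective floor on your own machine with a quick measurement (illustrative only; results vary with the kernel and timer configuration):

    import time

    # Ask for a 100µs sleep repeatedly and see what we actually get;
    # the shortest observed duration is the effective granularity.
    requested = 0.0001
    observed = []
    for _ in range(100):
        start = time.perf_counter()
        time.sleep(requested)
        observed.append(time.perf_counter() - start)
    print("requested %.6fs, shortest observed %.6fs" % (requested, min(observed)))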
Note that caching and branch prediction and the like make "constant time" implementations almost impossible.
No, it will add millisecond-accuracy jitter to the timing of your algorithm. And the attacker will continue to be able to extract microsecond-accuracy data from your timing attack. You need profoundly accurate hardware and a real-time operating system. Your attacker needs… more measurements.
Caching is equally unlikely to play a role. You're probably thinking of DJB's timing attacks on AES (http://cr.yp.to/antiforgery/cachetiming-20050414.pdf), but that attack revolves around the timing of S-box lookups in which the index of the lookup is key-dependent. In the constant-time algorithm proposed by Nate Lawson and others, the only array lookup is the incremented value of a for loop (i.e., constant-time). The other operations are XOR and OR, both of which are constant-time.
Branch prediction doesn't come into play because a constant-time comparison doesn't involve conditionals until all the data has been compared.
A constant-time implementation of AES? Hard. A constant-time implementation of an array comparison? Dead simple.
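Spelled out in Python, my reading of that approach looks like this (assuming string inputs, hence the ord() calls):

    def constant_time_compare(a, b):
        if len(a) != len(b):
            return False
        result = 0
        for i in range(len(a)):
            # The only lookup index is the loop counter itself, and
            # XOR/OR take the same time no matter what the bytes are.
            result |= ord(a[i]) ^ ord(b[i])
        # A single conditional, evaluated only after every byte is mixed in.
        return result == 0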