HighwayHash: Fast hashing at over 10 GB/s per core in Golang (minio.io)
127 points by jtsymonds on March 22, 2019 | 28 comments



This is a pretty sweet hash function, with security guarantees similar to SipHash but about 5x faster. Here's more info:

https://github.com/google/highwayhash
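
Rough sketch (mine, not from the article) of how you'd use the Go package, assuming the github.com/minio/highwayhash API; treat the exact names as illustrative:

  package main

  import (
      "crypto/rand"
      "encoding/hex"
      "fmt"

      "github.com/minio/highwayhash"
  )

  func main() {
      key := make([]byte, 32) // HighwayHash takes a 256-bit key
      if _, err := rand.Read(key); err != nil {
          panic(err)
      }
      h, err := highwayhash.New(key) // streaming hash.Hash interface
      if err != nil {
          panic(err)
      }
      h.Write([]byte("hello, world"))
      fmt.Println(hex.EncodeToString(h.Sum(nil)))
  }

If I remember right there are also one-shot helpers (Sum64 and friends) for the hash-table use case.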


Wicked.

  BenchmarkHighwayHash            11986.98 MB/s
  BenchmarkSHA256_AVX512           3552.74 MB/s
  BenchmarkBlake2b                  972.38 MB/s
  BenchmarkSHA1                     950.64 MB/s
  BenchmarkMD5                      684.18 MB/s
  BenchmarkSHA512                   562.04 MB/s
  BenchmarkSHA256                   383.07 MB/s


HighwayHash is explicitly called out as being not cryptographically secure. Every other hash in your list is (or was) considered cryptographically secure. Therefore these aren’t really comparable, and it was somewhat disingenuous of the authors here to make that comparison.


I think "somewhat disingenuous" is a bit soft. Given that these people should know better, in my mind it is scientific misconduct to compare only against cryptographically secure hashes.


Seems a little disingenuous to compare it to those hashes, which have very different guarantees re: hash collisions as they point out in the first section. Seems like SIP would be the better point of comparison, unless I've missed something.


I'm curious what the article is referring to when it comes to collision resistance. I was looking at the GitHub repo posted below and saw only claims of "strong" collision resistance: https://github.com/google/highwayhash

Unfortunately I'm not familiar enough with the details of the collision resistance properties of the other hash functions here to know how they compare. Could someone elaborate on the specifics of why HighwayHash compares poorly in terms of collision resistance to the others in the benchmark?


Would be interesting to see how MeowHash compares.


For all the enthusiastic folks: you may want to read more here on the SipHash comparisons

https://github.com/fwessels/HashCompare/blob/master/README.m...

https://github.com/fwessels/HashCompare/issues/1


It is a very cool function. It'd be interesting if a more thorough cryptanalysis were performed (though if you read the Google Research paper, you get the impression that they are breaking fresh ground, so a lot more effort would be involved in such an analysis). I think this function, or ones like it, may be very fast not only on AVX2 but on some other systems as well.


Would be cool to have something like this in the standard library. This is getting close to being memory bound. In fact it will be memory bound on most amd64 machines if you use more than 2 cores. It's kinda pointless to optimize much further, even if further optimization is possible.
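
If you want to check the memory-bound claim on your own machine, here's a rough sketch of the kind of throughput benchmark behind numbers like the table above (mine, assuming the github.com/minio/highwayhash package, placed in a _test.go file). b.SetBytes is what makes `go test -bench .` report MB/s, which you can compare against your DRAM bandwidth:

  func BenchmarkHighwayHash(b *testing.B) {
      key := make([]byte, 32)          // zero key is fine for a speed test
      data := make([]byte, 1<<20)      // hash 1 MiB per iteration
      b.SetBytes(int64(len(data)))     // makes -bench report MB/s
      b.ResetTimer()
      for i := 0; i < b.N; i++ {
          highwayhash.Sum64(data, key) // one-shot 64-bit digest
      }
  }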


Is it really written in Go if they're using optimized assembly to implement parts of it?


It's written in C++ and assembly; the Go library is just a wrapper.


There is no C++ in the code; it is a native Go implementation of Google's HighwayHash.


I think pretty much all of the "fast hash" implementations drop into assembly, even the C ones. Or at least that's my impression.


Curious why they do this: is it for memory management (GC avoidance), is it something like avoiding optimisation trade-offs, or is it just a mix of concerns?


I don't think memory is the issue. Hash algorithms are just complicated math, and that math is now targeting the CPU architectures themselves.

It's easier to make it fast by writing it in assembly, where you can hand-tune all the instructions, when you are already thinking at that low a level.

Perhaps not a concern for Blake, but timing attacks are real. If you need to make your function run in constant time, then your compiler is likely working against you.
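
To illustrate the constant-time point with something simpler than a hash core (my example, not from the thread): a naive digest comparison returns early and leaks how many leading bytes matched, while crypto/subtle does not.

  import "crypto/subtle"

  // Leaky: exits on the first mismatching byte, so timing reveals
  // how long the matching prefix is.
  func equalLeaky(a, b []byte) bool {
      if len(a) != len(b) {
          return false
      }
      for i := range a {
          if a[i] != b[i] {
              return false
          }
      }
      return true
  }

  // Constant time for equal-length inputs.
  func equalConstTime(a, b []byte) bool {
      return subtle.ConstantTimeCompare(a, b) == 1
  }

Inside the hash itself the same concern applies to data-dependent branches and table lookups, which is part of why people reach for hand-written assembly.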


Golang proudly serves as the archetype of a particular kind of language-design hubris which seems to be spreading like a disease lately: although feature X was required for my implementation, you (shouldn't need it / can't be trusted with it). It's famously the language which aims to be "90% perfect 100% of the time". In practice, this should be interpreted as being measured on a logarithmic scale for new algorithms or novel implementations using newer/obscure hardware features.

The standard distribution of Golang has historically been non-receptive to the notion of providing intrinsics, the capability to inline assembly function calls (note: not inlining assembly source), and in general to proposals to trivially extend the language or standard library to address performance concerns whose need would not be considered controversial in any other context (I'm not talking about generics here). In quite a few cases it has been communicated that certain features will never be made available because fuck you that's why, although in recent releases there has been a trend of backtracking on these "promises" under the onslaught of justifiable need.

These conscious "trade-offs" severely impact the specific cases where dedicated hardware instructions exist that are capable of providing sometimes orders of magnitude improvement but are not yet encapsulated by builtins or blessed with snowflake exceptions in the standard library.

Your options for implementing the kinds of compute-bound tight loops which benefit from specialized instructions in Golang are:

1) Write the whole thing in Golang assembly so you only pay the price of a function call once on entry (let's call this 0.4ns for a 2.5GHz CPU using a single-cycle-latency instruction per operation).

2) Use hilarious bit-twiddling hacks that the compiler can inline which perform the equivalent operation with 10-20 cycle latency (4-8ns).

3) Use a Golang assembly library which presents a single invocation of the hardware instruction as a Golang function and have all of the gains absorbed by the function call overhead (~10ns).

4) Use the primitives available via the standard library to implement the function (as a general rule not worth wasting the time to optimize and comparatively benchmark for this class of function).

5) Fork Golang, implement what you need, offer it as a pull request, and have upstream tell you to just go back to C or assembly if you care about performance (literally).

6) Target gccgo/(llgo?) (much better for this specific class of function but less performant in other areas; a different set of trade-offs).

In short: optimization trade-offs.
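
To make options 1) and 3) above concrete, the Go-assembly route looks roughly like this: a body-less declaration on the Go side and a Plan 9-syntax .s file supplying the implementation. The function name, signature and registers here are made up for illustration, not from any real library:

  // hash_amd64.go
  //go:noescape
  func updateAVX2(state *[16]uint64, msg []byte)

  // hash_amd64.s
  #include "textflag.h"

  // func updateAVX2(state *[16]uint64, msg []byte)
  TEXT ·updateAVX2(SB), NOSPLIT, $0-32
      MOVQ state+0(FP), AX     // pointer to hash state
      MOVQ msg_base+8(FP), SI  // data pointer of the slice
      MOVQ msg_len+16(FP), CX  // length in bytes
      // ... AVX2 rounds elided ...
      RET

The call into updateAVX2 is a plain, non-inlinable Go function call, which is acceptable when it processes a whole buffer (option 1) but eats all the gains when it wraps a single instruction (option 3).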


They claim the same "fastness" as SipHash whilst actually being on the slower end of all good hash functions. Too slow for real-world usage, and not secure enough against real-world attacks. Compare https://github.com/rurban/smhasher and my complaints https://github.com/google/highwayhash/issues/28


No offense, but you don't come off very nicely in that thread. Perhaps you could rephrase your criticism in a way that would make the maintainers more likely to listen to you?


No, it led to actual harm to billions of users. And Google did listen and reacted accordingly.

But they still didn't put their hash up for external scrutiny; still waiting on the pull request to get it tested with smhasher.


Still, there's no need to be rude.


I'm never rude.


…I personally find that very hard to believe. Maybe you weren't specifically "rude" here, but you were clearly unpleasant: you called their claims "false" and "nonsense" repeatedly, which is rarely if ever necessary.


Calling false claims "false" is necessary and not rude.

And I just added HighwayHash now to smhasher myself, and my initial analysis was confirmed. It's the 2nd slowest of all tested good hashes, only behind SipHash. Every other one is simpler and faster. Chi-square on the lower 32 bits was 0.00, so it's really a good one. But I see no use case for it, really.


The WOOL 2009 paper where it takes days to attack a 14-bit key is irrelevant for attacking a 64-bit key. They even say that 32 bits would make the attack infeasible.

Offline attacks to find collisions are irrelevant because revealing the key breaks the security model.

Saying you can recover the key from side-channel attacks on probe sequences is actually a claim that SipHash/HighwayHash are not secure PRFs.

If you're not making these claims, you're not communicating very well what your security claims are.


Attacking a hash table needs no more than 14 bits. This is enough to produce enough collisions to break a typical bad table. Whether a hash function produces 14, 32 or 64 bits is irrelevant if only the lowest needed bits are used.

My hash tables are secure against known seeds, BTW; every hash table should be. Almost all use cases make it trivial to get at the seed one way or another. Calling a hash table secure while relying on a secret random seed is theatre, and securing the table itself is actually simpler and faster than relying on a broken model and slow hash functions.


I think you're confusing the key/seed with the output. The paper is about key sizes, not hash output sizes. Again, if you claim it's trivial to recover the seed in "almost all use cases", you should be able to demonstrate that pretty easily, no?

I don't have anything against recommendations to use all bits in the hash output, or good collision handling, just these unfounded security claims about key recovery.


Title should probably mention "(2018)".



