PostgreSQL's Hash Indexes Are Now Cool (rhaas.blogspot.com)
214 points by rachbelaid on Sept 27, 2017 | 38 comments



Are there any plans for allowing hash indexes in uniqueness constraints such as the ones created for primary keys? It seems like a good fit for an index that is specialized for equality checks.
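For reference, a minimal sketch of what happens today if you try (hypothetical table; the error text is what PostgreSQL currently emits):

    CREATE TABLE t (id int);
    CREATE UNIQUE INDEX ON t USING hash (id);
    -- ERROR:  access method "hash" does not support unique indexes

    -- a non-unique hash index is fine:
    CREATE INDEX ON t USING hash (id);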


Maybe it would be easier just to remove the "A foreign key must reference columns that either are a primary key or form a unique constraint" restriction, stating that any unique index is enough.


> Maybe it would be easier just to remove the "A foreign key must reference columns that either are a primary key or form a unique constraint" restriction, stating that any unique index is enough.

The problem is that only btree indexes support uniqueness atm. That's the relevant unsupported feature, not the ability to have constraints (which essentially just requires uniqueness support in the underlying index):

    postgres[27716][1]# SELECT amname, pg_indexam_has_property(oid, 'can_unique') FROM pg_am;
    ┌────────┬─────────────────────────┐
    │ amname │ pg_indexam_has_property │
    ├────────┼─────────────────────────┤
    │ btree  │ t                       │
    │ hash   │ f                       │
    │ gist   │ f                       │
    │ gin    │ f                       │
    │ spgist │ f                       │
    │ brin   │ f                       │
    └────────┴─────────────────────────┘
    (6 rows)

Edit: the different uses of the word constrain (to constrain, and a constraint) seemed too confusing.


Is the hash guaranteed to be unique, though? There is always a possibility of a hash collision.


> Is the hash guaranteed to be unique, though? There is always a possibility of a hash collision.

I mean, my point is that hash indexes do not support uniqueness right now. But hash collisions wouldn't be a problem there. It does mainly require some tricky concurrency-aware code (consider cases like where one transaction has just deleted a conflicting row but is still in progress, and a new value like that is inserted, etc.).


Perhaps I'm missing something, but wouldn't the most efficient possible hash function for a set of unique integers be identity? So I don't see how an ID column would benefit.


You can use such a hash function, but it wouldn't be a good one for general-purpose use. Remember that hash tables, and also our hash indexes, use buckets / ranges of hash values to keep the hash table at a reasonable size. If you don't use a hash function with good bit perturbation, you can end up with a lot of values in the same bucket.

Consider e.g. the common implementation where buckets are determined by masking out either the lowest or the highest bits of the hash value. If you mask out the highest bits (keeping the low ones) and your values aren't sequential (pretty common), you end up with a lot of collisions. More extremely, if you mask out the low bits and shift (keeping the high ones), and you only have small values, everything ends up in the first bucket. Therefore what you want is a hash function where a one-bit change at "one side" of the input value is likely to affect most of the remaining bits.
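A quick way to see the difference from psql (a sketch using generate_series and the built-in hashint4; with an identity "hash", every multiple of 8 lands in bucket 0 when buckets mask the low three bits):

    SELECT i & 7           AS bucket_identity,   -- identity hash, low-bit mask
           hashint4(i) & 7 AS bucket_perturbed   -- real hash function
    FROM generate_series(0, 80, 8) AS s(i);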


Wasn't sure what a hash index was vs. a btree.

Short version: hash indexes are faster in PG 10, but they only apply to equality queries like "where x = 'foobar'", giving O(1) time. Btree indexes are O(log n).

But hash indexes can't be applied to range clauses, like "where x < 50". You can still use a btree index for those, however.

SO post:

https://stackoverflow.com/a/398921
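A minimal sketch of the distinction (table and column names are hypothetical):

    CREATE INDEX orders_email_hash ON orders USING hash (customer_email);

    -- can use the hash index:
    SELECT * FROM orders WHERE customer_email = 'a@example.com';

    -- cannot use it; a btree index on order_total would be needed:
    SELECT * FROM orders WHERE order_total < 50;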


Keep in mind that the log n factor of a B+tree is generally very low; B+tree branching factors are typically in the 100s. Also, the first few levels are generally kept in cache, so you'll only have to hit disk for inner nodes past 100 million entries or so.

Finally, hash indexes always require that the found row be confirmed in the data table, even for simple existence queries, since the keys themselves aren't stored in the hash table. (This is why hash indexes can't be UNIQUE.) B+trees can often answer such queries without the extra lookup (an "index-only scan"). If your B+tree is so large that its inner nodes spill onto disk (necessitating a 2nd disk seek), chances are the equivalent hash index will as well, which, combined with the consult of the data table, kind of negates the benefit.
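One way to observe this (a sketch, assuming a table t with a btree primary key on id): after a VACUUM has updated the visibility map, EXPLAIN (ANALYZE) on a covering query reports an Index Only Scan with zero heap fetches, whereas a hash index would always need the heap:

    VACUUM t;  -- updates the visibility map
    EXPLAIN (ANALYZE) SELECT id FROM t WHERE id = 42;
    -- btree: "Index Only Scan using t_pkey ... Heap Fetches: 0"
    -- a hash index on id would produce a plain Index Scan instead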


You're right, but I don't think the use case you mentioned (as a competitor to B+ trees for direct lookups) is the target use case for a hash index.

A hash index can be a big win for nested loop joins, especially at high concurrency. It is quite common to build a hash table over a subset of the inner table of an equijoin. This is (a) slow to construct and (b) memory-intensive (especially if many of these queries are run concurrently).

With a hash index, a lot of cases that required building a hash table to speed up a query can just use the hash index directly. Furthermore, every concurrent instance of the query can use the same hash index. This is a big win for both performance of a single query (latency) and query scalability.


That's a hash join which has existed in Postgres for many years. Hash indexes are unrelated.

The primary use case for hash indexes is situations where the indexed fields are very large. A hash index entry is fixed-size regardless of the width in bytes of the key, so it "wins" storage-wise when the key is wide.


> That's a hash join which has existed in Postgres for many years. Hash indexes are unrelated.

I think the OP ankrgyl is aware of that. To quote:

>> It is quite common to build a hash table over a subset of the inner table of an equijoin. This is (a) slow to construct and (b) memory-intensive (especially if many of these queries are run concurrently).


Ah, I think I misread that section with "nested loop join" stuck in my mind.


I think the win for that kind of scenario is smaller than you make it out to be. I agree on memory usage, but the concurrency argument seems largely faulty because at the same time you increase page-level contention on a shared resource (postgres' buffer cache).

I personally want a lazily filled hash join / something like an index nested loop join that has a fixed-size cache for both positive and negative lookups.


> I don't think the use case you mentioned (as a competitor to B+ trees for direct lookups) is the target use case for a hash index.

Exactly – my comment is directed at those who might think so, because O(1) sounds way better than O(log n). When really, except for data sets in the 10s of billions of rows, there are more subtle reasons to use them (as you point out).


Too late to edit now but "100 million" should read "20 billion"… I missed a tree layer. (Assuming a branching factor of 200 and 4 GiB of index cache: 500 thousand inner nodes of 8 KiB each can fit in cache, corresponding to 100 million leaves, which can contain 20 billion entries.)

So hash indexes really only begin to show benefits for individual lookups when your data is terabyte-scale, and below that can even be harmful for that use case (if you could otherwise benefit from an index-only lookup). But see ankrgyl's comment (sibling to this one) for better reasons to consider hash indexes.


> Finally, hash indexes always require that the found row be confirmed in the data table, even for simple existence queries, since the keys themselves aren't stored in the hash table. (This is why hash indexes can't be UNIQUE.)

I don't think that's really true - even for btree indexes the heap is accessed to check row visibility. That's necessary because postgres doesn't store visibility information in indexes. There's imo no really big problem making hash indexes support uniqueness - it's "just" work.


The heap need only be accessed if the visibility map indicates it ought to be. (Otherwise index-only scans would be pretty useless.)

"Visibility information is not stored in index entries, only in heap entries; so at first glance it would seem that every row retrieval would require a heap access anyway. And this is indeed the case, if the table row has been modified recently. However, for seldom-changing data there is a way around this problem. PostgreSQL tracks, for each page in a table's heap, whether all rows stored in that page are old enough to be visible to all current and future transactions." [1]

But you are right regarding UNIQUE. No reason it couldn't be implemented for hash indexes.

[1] https://www.postgresql.org/docs/9.6/static/indexes-index-onl...
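If you want to poke at the map directly, the contrib extension pg_visibility exposes it (sketch; the table name is hypothetical):

    CREATE EXTENSION pg_visibility;
    SELECT all_visible, all_frozen
    FROM pg_visibility_map('t', 0);  -- page 0 of table t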


I think you might have misunderstood what I meant - I was talking about the index insertion case, where uniqueness has to be verified. And that indeed checks the heap regardless of the VM bit. Check _bt_doinsert()'s invocation of _bt_check_unique() and the latter's contents. Therefore there's no fundamental problem with doing the same in hash insertion cases, even if more heap fetches are likely to be necessary.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f...


Ah gotcha, I thought you were making two separate comments. Makes sense.


> Finally, hash indexes always require that the found row be confirmed in the data table

Probably to avoid returning the wrong result for statistically-inevitable collisions, right?


Probably it is needed to check if the row is still visible to that transaction.


Postgres uses a visibility map [1] to avoid hitting the heap merely to determine the visibility of stable data.

[1] https://www.postgresql.org/docs/9.6/static/indexes-index-onl...


Correct.


> We can see here that the hash index performs better than the btree index and the performance difference is in the range of 10 to 22%. In some other workloads we have seen a better performance like with hash index on varchar columns and even in the community, it has been reported that there is performance improvement in the range of 40-60% when hash indexes are used for unique index columns.

The quote is from this post: http://amitkapila16.blogspot.com/2017/03/hash-indexes-are-fa...


Don't forget that the performance improvement of hash over b-tree gets better for wide keys - the wider the key, the bigger the improvement, as the key width is irrelevant to the hash index.

Depending on the width of the key, the row count, and the amount of memory available, there might be instances where a disk hit gets replaced with an in-memory cache hit, which is awesome for spinning-disk-based systems.
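A quick way to check this on your own data (sketch; "docs" and "url" are hypothetical):

    CREATE INDEX docs_url_btree ON docs (url);
    CREATE INDEX docs_url_hash  ON docs USING hash (url);
    SELECT relname, pg_size_pretty(pg_relation_size(oid))
    FROM pg_class
    WHERE relname IN ('docs_url_btree', 'docs_url_hash');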


> the wider the key, the better the improvement

Good point, I think it would be nice to add it to the docs.

Maybe with part numbers as an example: in a typical OLTP DB they are variable-width strings, mostly read-heavy, unique (not as a constraint but as a matter of fact), and searched on almost constantly.


Hopefully Postgres will also use them for joins? That’s the real performance win, since an equijoin is like N * WHERE X = Y.


Yes, it will for index nested loop joins. But e.g. a merge join can't really benefit from a hash index, in contrast to a btree index, which can provide the ordering without a sort step.


It is for this reason that the InnoDB Adaptive Hash Index exists: https://dev.mysql.com/doc/refman/5.7/en/innodb-adaptive-hash...

> Based on the observed pattern of searches, MySQL builds a hash index using a prefix of the index key. The prefix of the key can be any length, and it may be that only some of the values in the B-tree appear in the hash index. Hash indexes are built on demand for those pages of the index that are often accessed.

> If a table fits almost entirely in main memory, a hash index can speed up queries by enabling direct lookup of any element, turning the index value into a sort of pointer. InnoDB has a mechanism that monitors index searches. If InnoDB notices that queries could benefit from building a hash index, it does so automatically.

I am really excited by some of the PostgreSQL developments lately, in particular parallel query execution. To the best of my knowledge, the only place that currently exists within MySQL (or MariaDB) is MySQL Cluster.


I think hash indexes help in joins as well. For "where table1.c1 = table2.c2", a hash index on c2 would let the join be done in O(n) over table1.c1 with O(1) lookups into table2.c2.
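Sketch (hypothetical tables): with a hash index on the inner column, the planner can drive a nested loop with O(1) probes:

    CREATE INDEX t2_c2_hash ON table2 USING hash (c2);
    EXPLAIN SELECT *
    FROM table1 JOIN table2 ON table1.c1 = table2.c2;
    -- may show: Nested Loop -> Seq Scan on table1
    --                       -> Index Scan using t2_c2_hash on table2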


For now, the big problem with range queries, in my mind, would be how you get the results back in any reasonable order in any reasonable amount of time.


I hope Postgres keeps on getting Cooler. Reading all the Change Logs makes me feel warm and fuzzy inside.


I remember someone commenting a while back that, with hash indexes allowing you to navigate relationships in O(1) time, relational DBs can approximate the perf characteristics of a graph DB. When I did a quick and dirty test a couple years ago (before they would have been usable anyway due to durability and replication), I found that hash indexes performed noticeably worse than btree for navigating a few test tables with 10-100m rows each. Curious whether this is a major enough improvement for hash indexes that btree will not still be faster for many common equality lookups.


Any good articles that go into detail on when to use hash indexes in PG 10?


Is it just me, or has PostgreSQL kept on getting better and better, to the point of being #1 DB in the open source market?


We have been talking about this for a while. Consider the coming of more distributed and NVMe-based storage, and a lot of these things eventually make sense.


> A report from a tester who goes by "AP" in July tipped us off to the need for a few further tweaks. AP found that trying to insert 2 billion rows into a newly-created hash index was causing an error.

This is what I call trying things "at Indian scale" :D



