*However, using a Hilbert Curve for sharding doesn't seem like the best approach...

n4r9 · on July 20, 2018

In my experience the linked algorithm behaves quite poorly when the dataset and number of desired clusters becomes large. I think the issue is that it only allows for pairwise switching between clusters, and this ends up with a lot of points still being assigned to clearly the wrong cluster. Some day I would like to try more complex neighbourhood searches with it and see if it helps.