Hacker News new | past | comments | ask | show | jobs | submit login

These all look like clustering based on a 2d space, but does anyone know methods to tackle clustering on a network?

Is it just a matter of tweaking the definition of density / distance to the number of hops, or is it a different problem entirely? I can see how with 0 or 1 hops the data would be a very smushed distribution, versus 2d distance is much more rich and spread out.




Clustering algorithms generally only need a distance metric. Here 2d space is used for illustration just because it is easy to visualize.

For n-dimensional space, geometric distance is often (but not always) used, but you can just use # of hops instead of that.


Brilliant, that's what I figured, thank you!


If you want to use an HDBSCAN algorithm on graphs then I suggest you look into Spectral Clustering (which traditionally uses K-Means, but could use HDBSCAN instead. Otherwise you may want to consider graph specific algorithms such as Louvain.


Looks at spectral clustering. Spectral methods usually take the distances between points to generate a graph, then operate directly on that graph. If you already have a graph/network, you can use those methods directly on it.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: