
It would be interesting to know which agglomerative clustering linkage the author is using here. I suspect that for this two-dimensional, density-based cluster dataset, single-linkage agglomerative clustering would perform much better than what is shown (likely average linkage).



It was actually Ward. I agree that single linkage would have performed better; however, the noise would have greatly confused the issue. Robust Single Linkage (Chaudhuri and Dasgupta, 2010) would be the better choice in the presence of noise, as with the test dataset (which was designed to be as difficult as possible for clustering algorithms while being obvious to a human viewer).
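To make the noise problem concrete, here is a rough sketch of plain single linkage on a toy example. The points are made up for illustration and are not the test dataset from the post, and `single_linkage` is a hypothetical helper written for this comment (Kruskal-style merging of the closest pair until k clusters remain), not the implementation used in the benchmark.

```python
from itertools import combinations
from math import dist


def single_linkage(points, k):
    """Repeatedly merge the two closest clusters until k clusters remain."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Single linkage only cares about the shortest edge between clusters,
    # so processing all pairs in order of distance is enough.
    edges = sorted(combinations(range(len(points)), 2),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
    clusters = len(points)
    for i, j in edges:
        if clusters == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            clusters -= 1
    return [find(i) for i in range(len(points))]


# Two obvious groups on a line (toy data, assumed for illustration).
group_a = [(0, 0), (1, 0), (2, 0), (3, 0)]
group_b = [(10, 0), (11, 0), (12, 0), (13, 0)]
# Noise: a sparse bridge between the groups plus one stray outlier.
bridge = [(4.5, 0), (6, 0), (7.5, 0), (9, 0)]
outlier = [(6.5, 5)]

clean_labels = single_linkage(group_a + group_b, k=2)
noisy_labels = single_linkage(group_a + group_b + bridge + outlier, k=2)

# Without noise the two groups are separated.
print(clean_labels[0] != clean_labels[4])
# With noise the bridge chains the groups together and the second
# cluster is spent on the lone outlier instead.
print(noisy_labels[0] == noisy_labels[4], noisy_labels[-1] != noisy_labels[0])
```

On the clean points the two groups come out as separate clusters. With the bridge and the outlier added, the chaining effect pulls both groups into one cluster, which is the kind of failure a noise-robust variant is meant to avoid.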



