
Hi, thanks for the feedback. There are three main points that make me believe the clusters aren't artificial. First, I built the clusters with DBSCAN on the original data (100-d word vectors), not on the t-SNE embedding. Second, I manually inspected some clusters and they make sense when compared to the similarity queries on the word2vec model (take a look at the star names, for instance). Third, I took the folios from the two main clusters (red/blue) (https://www.reddit.com/r/MachineLearning/comments/419e5a/voy...) and they seem to match the folios from the two-languages hypothesis.
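To make the first point concrete, here is a minimal sketch of DBSCAN run directly on high-dimensional vectors. Everything specific in it is an assumption for illustration, not what I actually ran: the synthetic 100-d data, the cosine distance, the `eps` and `min_samples` values, and the helper names `make_group`, `cosine_distance` and `dbscan`. It is plain Python, not a library call.

```python
import math
import random

NOISE = -1  # label for points that belong to no cluster


def cosine_distance(u, v):
    """1 - cosine similarity; an assumed metric for word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dbscan(points, eps, min_samples):
    """Plain O(n^2) DBSCAN on the original (unprojected) vectors."""
    n = len(points)
    # Each neighborhood includes the point itself.
    neighbors = [
        [j for j in range(n) if cosine_distance(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep expanding
        cluster += 1
    return labels


def make_group(rng, size, dim=100, spread=0.05):
    """Hypothetical stand-in for word vectors: points around a random center."""
    center = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [[c + rng.gauss(0.0, spread) for c in center] for _ in range(size)]


rng = random.Random(0)
vectors = make_group(rng, 20) + make_group(rng, 20)  # two synthetic groups
labels = dbscan(vectors, eps=0.1, min_samples=5)     # assumed parameters
print(sorted(set(labels)))
```

The labels are computed before any projection happens, so a 2-D embedding such as t-SNE would only decide where each point is drawn, not which color it gets.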


