Modeling a word co-occurrence graph and then pruning "weak" edges (or getting a similar pruning effect by using community detection to find clusters) works kind of like "feature selection" based on something resembling bare-bones mutual information or tf*idf.
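
To make the analogy concrete, here is a rough sketch of the edge-pruning idea, not anyone's actual pipeline. It assumes document-level co-occurrence, uses pointwise mutual information (PMI) as the edge weight, and drops edges at or below a threshold. The function names and the `min_pmi` parameter are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_pruned_graph(docs, min_pmi=0.0):
    """Build a document-level co-occurrence graph and keep only
    edges whose PMI is strictly above min_pmi (hypothetical threshold)."""
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words
    for doc in docs:
        vocab = sorted(set(doc))  # sorted so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    edges = {}
    for (a, b), n_ab in pair_df.items():
        # PMI = log( P(a, b) / (P(a) * P(b)) ), estimated from document counts
        pmi = math.log(n_ab * n_docs / (word_df[a] * word_df[b]))
        if pmi > min_pmi:
            edges[(a, b)] = pmi
    return edges


def surviving_words(edges):
    """Words that still have at least one edge after pruning."""
    return {word for pair in edges for word in pair}


docs = [
    ["the", "topic", "model"],
    ["the", "graph", "edge"],
    ["the", "topic", "model"],
    ["the", "graph", "edge"],
]
edges = pmi_pruned_graph(docs)
print(sorted(edges))
print(sorted(surviving_words(edges)))
```

In this toy corpus "the" appears in every document, so its PMI with any other word is zero and all of its edges get pruned. That is the same effect you would expect from an idf-style weight: a word that co-occurs with everything carries no information and falls out of the graph, which is where the "feature selection" feel comes from.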

I'm not entirely familiar with LDA, but from what I was able to understand from their intro, it feels like their LDA application could have used some feature selection.



