Ask HN: What approach would you suggest for Text classification?
1 point by gerenuk on June 25, 2018 | 1 comment
Hey everyone!

We are trying to classify articles into the correct categories.

Currently, we are using FastText to train a model on 100,000 articles labeled with 600 categories. The loss seems to be converging, but the precision is not going up. We would also like to know whether we can use the pre-trained English Wikipedia embeddings to categorize text.
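
For reference, here is a minimal sketch of the precision figure in question, assuming it means precision at k (the share of the top-k predicted labels that are correct). The function name and the toy labels are hypothetical, not taken from our actual pipeline.

```python
def precision_at_k(true_labels, ranked_predictions, k=1):
    """Share of the top-k predicted labels that are correct, over all examples.

    true_labels: list of sets of gold labels, one set per article.
    ranked_predictions: list of label lists, best guess first.
    """
    hits = 0
    total = 0
    for gold, ranked in zip(true_labels, ranked_predictions):
        top_k = ranked[:k]
        # Count how many of the k best guesses are gold labels.
        hits += sum(1 for label in top_k if label in gold)
        total += len(top_k)
    return hits / total if total else 0.0


# Toy data (made up for illustration): three articles, one gold label each.
gold = [{"sports"}, {"politics"}, {"tech"}]
preds = [["sports", "tech"], ["tech", "politics"], ["tech", "sports"]]

p_at_1 = precision_at_k(gold, preds, k=1)
p_at_2 = precision_at_k(gold, preds, k=2)
```

Computing the same number per category, rather than over the whole set, may show whether a few large categories are hiding many weak ones.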

Would you recommend FastText, or some other algorithm or approach, for this problem?

Any suggestions or ideas would be appreciated.

Thanks.




FastText is state of the art for word embeddings because it can generate embeddings even for words it has not seen, so perhaps the problem lies in your model's architecture. Are you using convolutional neural networks or just basic feed-forward networks? I have had great success using CNNs for text classification. Also, in your preprocessing, are you filtering out stopwords (very common English words that can undermine a model's ability to classify text correctly)?
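
To make the two points above concrete, here is a rough sketch in plain Python. It shows how a word can be broken into character n-grams, which is the idea that lets FastText build a vector for an unseen word from its pieces, and how stopwords can be filtered before training. The stopword list, the function names, and the n-gram range are assumptions for illustration only; a real project would use a fuller list and the library's own settings.

```python
# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "is", "of", "in", "and", "to"}


def remove_stopwords(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return " ".join(t for t in text.lower().split() if t not in STOPWORDS)


def char_ngrams(word, n_min=3, n_max=6):
    """Return character n-grams of a word wrapped in boundary markers.

    An unseen word still shares many of these n-grams with known words,
    so a vector can be composed from the n-gram vectors.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


cleaned = remove_stopwords("The price of oil is rising in the market")
trigrams = char_ngrams("where", n_min=3, n_max=3)
# cleaned  -> "price oil rising market"
# trigrams -> ["<wh", "whe", "her", "ere", "re>"]
```

Whether stopword removal helps depends on the data, so it is worth comparing precision with and without it.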



