
Links to the relevant papers:

Bag of Tricks for Efficient Text Classification: https://arxiv.org/abs/1607.01759v2

Enriching Word Vectors with Subword Information: https://arxiv.org/abs/1607.04606

Both fantastic papers. For those who aren't aware, Mikolov also helped create word2vec.

One curious thing: this seems to use hierarchical softmax instead of the "negative sampling" described in their earlier paper (http://arxiv.org/abs/1310.4546), despite that paper reporting that negative sampling is more computationally efficient and of similar quality. Anyone know why that might be?
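For anyone comparing the two: the negative sampling objective from that paper scores only the true target plus k sampled negatives per example, versus roughly log2(V) binary decisions along a tree for hierarchical softmax. A minimal numpy sketch of that objective (toy, randomly initialized vectors; the names here are illustrative, not fastText's internals):

    import numpy as np

    rng = np.random.default_rng(0)
    V, d, k = 10_000, 100, 5   # vocab size, embedding dim, number of negatives

    W_in = rng.normal(scale=0.1, size=(V, d))   # input (context) vectors
    W_out = rng.normal(scale=0.1, size=(V, d))  # output (target) vectors

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def neg_sampling_loss(center, target, negatives):
        # Score the true (center, target) pair plus k sampled negatives:
        # k+1 dot products per example instead of a full |V|-way softmax.
        v = W_in[center]
        pos = np.log(sigmoid(W_out[target] @ v))
        neg = np.log(sigmoid(-(W_out[negatives] @ v))).sum()
        return -(pos + neg)

    negatives = rng.integers(0, V, size=k)  # the paper draws these from a unigram^(3/4) distribution
    print(neg_sampling_loss(center=42, target=7, negatives=negatives))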

It is possible to choose between negative sampling (ns), softmax, or hierarchical softmax (hs) using the -loss option.
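For example, via the Python bindings (a minimal sketch assuming the official fasttext pip package; data.txt is a hypothetical corpus path):

    import fasttext

    # loss can be 'ns' (negative sampling), 'hs' (hierarchical softmax), or 'softmax'
    model_hs = fasttext.train_unsupervised('data.txt', model='skipgram', loss='hs')
    model_ns = fasttext.train_unsupervised('data.txt', model='skipgram', loss='ns')

The command-line equivalent would be ./fasttext skipgram -input data.txt -output model -loss hs.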

Cool, thank you!
