
Does the same thing with Russian as well.

(I wonder how it would tell the difference between Russian, Ukrainian and Bulgarian though?)
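One rough way to approach that question is to look for letters that only one of the three alphabets uses. The sketch below is an illustration, not how any particular detector works: the function name `guess_cyrillic_language` and the letter sets are assumptions, and the Bulgarian guess is only a fallback for Cyrillic text that contains none of the marker letters, so it will be wrong for short Russian or Ukrainian snippets.

```python
# Heuristic sketch: letters found in only one of the three alphabets.
# Ukrainian uses і, ї, є, ґ; Russian uses ы, э, ё; Bulgarian uses none of them.
UKRAINIAN_ONLY = set("іїєґ")
RUSSIAN_ONLY = set("ыэё")


def guess_cyrillic_language(text):
    """Return 'uk', 'ru', 'bg?' or 'unknown' based on marker letters (hypothetical helper)."""
    lowered = text.lower()
    uk_hits = sum(ch in UKRAINIAN_ONLY for ch in lowered)
    ru_hits = sum(ch in RUSSIAN_ONLY for ch in lowered)

    if uk_hits > ru_hits:
        return "uk"
    if ru_hits > uk_hits:
        return "ru"
    if uk_hits == 0:
        # No marker letters at all: only a weak hint of Bulgarian,
        # and only if the text contains basic Cyrillic letters.
        has_cyrillic = any("а" <= ch <= "я" for ch in lowered)
        return "bg?" if has_cyrillic else "unknown"
    # Equal, non-zero counts: the heuristic cannot decide.
    return "unknown"


if __name__ == "__main__":
    for sample in ["Это мой город", "Це моє місто", "Това е моят град"]:
        print(sample, "->", guess_cyrillic_language(sample))
```

A statistical detector trained on character n-grams should do much better than this on real text; the letter check only shows why the three languages are separable at all.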




There are other romanisations of Korean besides this one, although this is the official government one.

For maximum confusion, there are also Cyrillizations. How would it cope with those? :)


>>> import langdetect

>>> probabilities = {x.lang: x.prob for x in langdetect.detect_langs(text)}

Works for me, for differentiating Russian and Ukrainian text.



