
I've been doing some work parsing Vietnamese text, which has the opposite problem. Compound words (which make up most of the vocabulary) are broken up into their components by spaces, and those spaces are indistinguishable from the boundaries between words.
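
To illustrate the problem, here is a minimal sketch of one simple baseline: greedy longest-match against a lexicon of known multi-syllable words. The tiny lexicon and the `segment` function are hypothetical, made up for this example; a real segmenter would need a much larger dictionary or a statistical model, and greedy matching can pick the wrong grouping when syllables are ambiguous.

```python
import unicodedata


def norm(syllable):
    """Normalize to NFC and lowercase so lookups are consistent."""
    return unicodedata.normalize("NFC", syllable).lower()


# Hypothetical toy lexicon of multi-syllable words, stored as syllable tuples.
LEXICON = {
    tuple(norm(s) for s in word.split())
    for word in ["việt nam", "sinh viên", "học sinh"]
}
MAX_WORD_SYLLABLES = max(len(word) for word in LEXICON)


def segment(text):
    """Group space-separated syllables into words by greedy longest match."""
    syllables = text.split()
    words = []
    i = 0
    while i < len(syllables):
        longest = min(MAX_WORD_SYLLABLES, len(syllables) - i)
        # Try the longest candidate first; fall back to a single syllable.
        for n in range(longest, 0, -1):
            candidate = tuple(norm(s) for s in syllables[i:i + n])
            if n == 1 or candidate in LEXICON:
                words.append(" ".join(syllables[i:i + n]))
                i += n
                break
    return words


print(segment("tôi là sinh viên Việt Nam"))
```

Splitting on spaces alone would give six tokens here; with the lexicon, "sinh viên" and "Việt Nam" are each kept together as one word.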



Is that why the name of the country is sometimes spelled with a space, "Viet nam"?


Yes, that's how it is written in Vietnamese. To oversimplify: Vietnamese words are a collection of single syllables that are always separated by a space when writing.

"Viet Nam" is also, actually, the "official" English way to write it. (Check how the UN puts it on all their stuff.) However, most Europeans don't do that in their languages, so it usually gets written as Vietnam even by Vietnamese when they're writing European languages.



