
The reason it can "make up words" is that it does not operate on "words" but on "tokens", which can be shorter or longer than a single word.

In this specific case, it has probably learned that the token "nie" can be prepended to (almost) any Polish word (like "un" in English) to form the negation of that word.

Cool story, though.

EDIT: Note that Google Translate, for example, has no problem with the word "niemiarytmiczny" and "correctly" translates it into English.
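
To make the token idea above concrete, here is a minimal sketch of subword splitting. It uses a greedy longest-match rule and a made-up toy vocabulary (`TOY_VOCAB` and `tokenize` are hypothetical names for illustration); real models learn their vocabularies from data and typically use schemes such as byte-pair encoding, so this is not how any particular model actually splits these words.

```python
def tokenize(word, vocab):
    """Greedy longest-match split of `word` into pieces from `vocab`.

    Characters that match no vocabulary entry are emitted one at a time.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first; fall back to a single character.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


# Hypothetical toy vocabulary, chosen only for this illustration.
TOY_VOCAB = {"nie", "mi", "arytmiczny", "mierzalny"}

print(tokenize("niemierzalny", TOY_VOCAB))     # ['nie', 'mierzalny']
print(tokenize("niemiarytmiczny", TOY_VOCAB))  # ['nie', 'mi', 'arytmiczny']
```

With this toy vocabulary, a word the tokenizer has never seen as a whole still breaks into known pieces, which is roughly why a model can read or produce a word that is not in any dictionary.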




It’s not about “nie” (prefixing it to an adjective does indeed form a negation). The word “miarytmiczny” does not exist either. However, native speakers will likely understand it anyway, as an adjective derived from the noun “miara”, meaning “measure”, even though the correct derived adjective is “mierzalny” (measurable).


Thanks for the correction.

In that case, Google Translate's attempt at parsing that word completely failed: it seems to interpret it as "niemi-arytmiczny" or "niemia-arytmiczny", rather than as "nie-miarytmiczny". Funny.

EDIT: DeepL's attempts at translating it (https://www.deepl.com/translator#pl/en/niemiarytmiczny) are also funny and include options such as "non-marithmetic"(?)



